Human amphiregulin (AR) is a heparin-binding growth factor which
functions by binding to and activating the epidermal growth factor
(EGF) receptor tyrosine kinase. AR contains an EGF-like domain
(residues 44-84) and a Lys/Arg-rich NH2-terminal extension
(residues 1-43). Synthetic peptides corresponding to residues 8-26,
26-44, and 68-84 of AR were tested for their ability to compete for
the binding of AR to immobilized heparin. AR 8-26 and AR 68-84 had no
significant effect on the binding of AR to heparin, whereas AR 26-44
bound to heparin and blocked the binding of AR to heparin. Both
soluble heparin and heparan sulfate inhibited AR-induced mitogenesis
in MCF-10A human mammary epithelial cells with an IC50 of 5 and
2 µg/ml, respectively, whereas soluble chondroitin sulfate had only a
slight inhibitory effect. When MCF-10A cells were grown in the
presence of chlorate, an inhibitor of sulfation, or exposed to the
glycosaminoglycan-degrading enzymes heparitinase or heparinase, the
ability of AR to evoke mitogenesis in these cells was lost. Chlorate,
heparitinase, or heparinase treatment inhibited AR-induced
autophosphorylation of tyrosine residues in the EGF receptor. None of
these treatments had any significant effect on EGF-triggered
mitogenic signaling by the EGF receptor. These results indicate that
extracellular heparan sulfate glycosaminoglycan is essential to
AR-induced mitogenic signaling by the EGF receptor tyrosine kinase.



Human amphiregulin (AR)¹ is a heparin-binding polypeptide
growth regulator which consists of an epidermal growth factor 
(EGF)-like domain and a very basic NH2-terminal extension
which contains glycosylation sites and putative nuclear local- 
ization signals (1-3). AR influences the proliferation of cells by 
binding to the extracellular domain of the EGF receptor 
(EGFR) which results in autophosphorylation of the EGFR, 
activation of the EGFR tyrosine kinase, and rapid tyrosine 
phosphorylation of a number of cellular substrates including 
p185erbB2 (4). AR stimulates the proliferation of normal and
malignant epithelial cells, fibroblasts, and keratinocytes (1-7). 
In vivo, AR is expressed by a large number of normal tissues (8) 



* The costs of publication of this article were defrayed in part by the 
payment of page charges. This article must therefore be hereby marked 
"advertisement" in accordance with 18 U.S.C. Section 1734 solely to 
indicate this fact. 

‡ To whom correspondence should be addressed: Division of Cytokine
Biology, CBER, FDA, HFM-611, Bldg. 29A, 8800 Rockville Pike, 
Bethesda, MD 20892. Tel.: 301-496-9012; Fax: 301-402-1659. 

1 The abbreviations used are: AR, human amphiregulin; EGF, human 
epidermal growth factor; EGFR, epidermal growth factor receptor; FGF, 
fibroblast growth factor; HB-EGF, heparin-binding EGF-like growth 
factor; GAG, glycosaminoglycan; HS, heparan sulfate; TGF-α,
transforming growth factor-α; PAGE, polyacrylamide gel electrophoresis;
SDGF, schwannoma-derived growth factor. 



but appears to be localized exclusively to the epithelium of the 
human colon (7, 9), stomach (10, 11), breast (12), and pancreas 
(13). AR has been shown to drive the proliferation of human 
colon carcinoma cells via an autocrine mechanism (7) and is 
commonly overexpressed in human cancers of the colon (7, 9, 
14, 15), breast (12, 16), stomach (10, 11, 15), and pancreas (13). 

Heparin affinity chromatography has been utilized to purify 
AR from the conditioned medium of human keratinocytes (2) as 
well as phorbol ester-treated human breast carcinoma cells (3), 
and 30 µg/ml of soluble heparin has been shown to inhibit the
ability of AR to stimulate the growth of keratinocytes (2). In 
addition to AR, a number of heparin-binding growth factors 
have been discovered within the last several years which con- 
tain an EGF-like domain such as schwannoma-derived growth 
factor (SDGF) (17), heparin-binding EGF-like growth factor 
(HB-EGF) (18), betacellulin (19), heregulin (20), and neu dif- 
ferentiation factor (21). The bioactivity of AR appears to be 
mediated exclusively through the EGFR (4, 7), as may be the 
case for SDGF, HB-EGF, and betacellulin, whereas the action of 
heregulin and neu differentiation factor appears to involve the 
erbB2, erbB3, and/or erbB4 EGFR-like tyrosine kinases (22-25).
Since it is very unlikely that these growth factors would
ever encounter heparin in vivo (26), the physiological signifi- 
cance of their ability to bind heparin is not clear. Proteoglycans 
are proteins which contain covalently attached sulfated glyco- 
saminoglycan (GAG), exist on the surface of cells and in the 
extracellular matrix and are believed to play important roles in 
a wide range of biological processes which include cell division, 
morphogenesis and cancer (26-29). One important subset of 
these molecules is the heparan sulfate (HS) proteoglycan (30- 
32) whose HS chain is structurally related to heparin, but in 
general, is sulfated to a lesser degree. HS proteoglycan has 
been shown to be obligatory for the mitogenic activity of basic 
and acidic fibroblast growth factor (FGF) (33-37) and vascular 
endothelial growth factor (38). HS also appears to play an im- 
portant role in the HB-EGF stimulation of smooth muscle cell 
migration (39). 

Recently, we isolated multiple, structurally distinct forms of 
AR (3). The predominant ~16.5-kDa forms contained sialic 
acid-rich complex N-linked oligosaccharide, in addition to O-linked
carbohydrate. However, a non-glycosylated ~9.5-kDa species was also
isolated which contained an intact EGF-like core, but had a truncated
NH2-terminal extension. All of these forms bound strongly to heparin
and were biologically active, demonstrating that the oligosaccharide
moieties and the extreme NH2-terminal region of AR are not essential
to heparin binding or bioactivity (3). This previous work also suggested
that the ability of AR to bind heparin may be related to its 
ability to activate the EGFR tyrosine kinase. In this report, we 
provide strong evidence that extracellular HS GAG is essential 
to AR-triggered mitogenic signaling by the EGFR. 

EXPERIMENTAL PROCEDURES 

Purification of AR — Human AR was purified to homogeneity from the
conditioned medium of phorbol 12-myristate 13-acetate-treated MCF-7
human breast carcinoma cells by sequential heparin affinity,
immunoaffinity, and reverse-phase high-performance chromatography as
described in Johnson et al. (3).

Preparation of AR Peptides — Peptides corresponding to residues
8-26 and 26-44 of AR were synthesized and purified as described pre- 
viously (7). The peptide corresponding to residues 68-84 of AR was 
prepared as described in Johnson et al. (3). 

Binding of AR to Immobilized Heparin — Twenty-five ng of AR and 5 µl
of heparin cross-linked to agarose (~1 µg heparin/µl resin; Sigma)
were added to 300 µl of 20 mM Hepes, 50 mM NaCl, pH 7.4 (buffer) in the
absence or presence of 20 µg of AR peptide. The mixture was rotated end
over end for 4 h at 4 °C and centrifuged, and the pellet was washed
three times with 1 ml of buffer. The pellet was boiled for 5 min in
20 µl of SDS-polyacrylamide gel electrophoresis (PAGE) sample buffer.
SDS-PAGE and Western blotting were performed as described in Johnson
et al. (3).

Digestion of AR with N-Glycosidase F — One unit of N-glycosidase F
(Boehringer Mannheim) was added to 100 ng of AR in 300 µl of 20 mM
Hepes, 50 mM NaCl, pH 7.4, and incubated for 4 h at 37 °C.

Cell Culture and Mitogenesis Assay — MCF-10A human mammary 
epithelial cells were cultured and the mitogenesis assay was performed
as described previously (3, 4). Briefly, 64 h after the addition of
250 pM AR or EGF (Life Technologies, Inc.), cells in 96-well plates
were pulsed for 6 h with [3H]thymidine (2 µCi/well; Amersham Corp.),
DNA was harvested, and the incorporation of [3H]thymidine into DNA was
quantitated.

EGFR Autophosphorylation Assay — Ligand-induced autophosphorylation
of tyrosine residues in the EGFR was measured as described in
Johnson et al. (4). MCF-10A cells were plated into 100-mm dishes at a
density of 785,000 cells per dish and after 2 days of growth were
stimulated with 250 pM AR or EGF for 9 min at 37 °C. The EGFR was
immunoprecipitated using E7 antiserum directed against the cytoplasmic
domain of the human EGFR (40), fractionated in an 8% SDS-PAGE
gel, and transferred to a polyvinyl difluoride membrane, and
tyrosine-phosphorylated EGFR was detected using biotinylated PY-20
antibody (ICN Biomedicals), streptavidin-horseradish peroxidase
conjugate, and enhanced chemiluminescence (Amersham).

Western Blotting Analysis of EGFR — Prior to immunoprecipitation of 
the EGFR in the autophosphorylation assay, aliquots of total cell crude 
lysates were taken to evaluate cellular EGFR levels. Proteins were 
fractionated in an 8% polyacrylamide SDS-PAGE gel and transferred to 
a polyvinyl difluoride membrane, and EGFR was detected using E7 
antiserum (40), the Vectastain ABC Elite kit (Vector Laboratories), and
enhanced chemiluminescence. 

RESULTS 

Residues 26-44 of AR Interact with Heparin — The binding of 
AR to immobilized heparin has greatly facilitated the purifica- 
tion of AR derived from the conditioned medium of human 
keratinocytes (2) and human breast carcinoma cells (3). To 
study the interaction of AR and heparin, a micro-assay was 
developed in which AR can be bound to a small quantity of
heparin that has been cross-linked to agarose (5 µl). After
washing the resin, bound AR can be released by boiling the 
resin in SDS-PAGE sample buffer. The AR is then fractionated 
in an SDS-PAGE gel and detected by Western blot analysis 
(Fig. 1). To identify heparin-binding regions in the AR molecule,
various synthetic peptides which correspond to distinct regions 
of AR were tested for the ability to block the binding of AR to 
immobilized heparin. Peptides which correspond to residues 
8-26 and 68-84 had no significant effect on the binding of AR
to immobilized heparin (Fig. 1A). However, the peptide corre- 
sponding to residues 26-44 bound to heparin, as evidenced by 
the fact that when the Western blot was performed using an- 
tibodies directed against residues 26-44 it resulted in a very 
strong immunopositive streak running down that lane of the 
gel (Fig. 1A). To confirm that AR 26-44 bound to the same site on
the heparin molecule as did AR and thus, could compete for the 
binding of AR to heparin, the experiment was repeated and AR 
was detected in the Western blot using an antibody directed 
against residues 8-26 of AR (AR-Ab3). Since there are two 
potential N-linked glycosylation sites within residues 8-26 of
AR, it was necessary to first digest AR with N-glycosidase F so 
Fig. 1. Effect of AR peptides on the binding of AR to immobilized
heparin. Twenty-five ng of AR were added to 5 µl of heparin-agarose
(~5 µg of immobilized heparin) in 300 µl of 20 mM Hepes, 50 mM
NaCl, pH 7.4 (buffer), in the absence or presence of 20 µg of peptide
corresponding to residues 8-26 of AR (AR 8-26), 26-44 (AR 26-44), or
68-84 (AR 68-84). After incubation for 4 h at 4 °C, the resin was washed
three times with 1 ml of buffer and boiled in SDS-PAGE sample buffer.
The sample was fractionated by SDS-PAGE under reducing conditions
in a 16% acrylamide gel and transferred to a nitrocellulose membrane.
The positions and molecular mass of marker proteins are shown to the
left in kilodaltons. In A, the nitrocellulose membrane was probed with
AR-Ab2 antibodies directed against residues 26-44 of AR (3, 7). In B, AR
was digested with N-glycosidase F prior to performing the
heparin-binding assay and the Western blot was probed with AR-Ab3
antibodies directed against residues 8-26 of AR (3). The specificity of
the antibodies used was confirmed by performing Western blot analyses
using purified control preimmune antibodies (data not shown).

that these antipeptide antibodies could recognize the molecule 
in Western blot analysis (3). As shown in Fig. 1B, the AR 26-44
peptide completely blocked the binding of AR to heparin. These 
results demonstrate that residues 26-44 constitute a heparin- 
binding region in the AR molecule. 

Effect of Soluble GAGs on AR-induced Mitogenesis in Human
Mammary Epithelial Cells — Various soluble GAGs were then
tested for the ability to affect mitogenesis evoked by AR and
EGF in MCF-10A human mammary epithelial cells. MCF-10A
is an immortalized nontransformed cell line in which the action
of AR is mediated solely by the EGFR tyrosine kinase (4). 
Previous work from our laboratory has demonstrated that in 
MCF-10A cells there is an excellent correlation between AR- 
and EGF-driven increases in cell number and increases in the 
incorporation of [3H]thymidine into DNA (DNA synthesis) (3,
4). As shown in Fig. 2, both soluble heparin (Panel A) and HS 
(Panel B) inhibited mitogenesis induced by 250 pM AR with an 
IC50 of 5 and 2 µg/ml, respectively. However, even at very high
concentrations of soluble heparin or HS (100 µg/ml) complete
inhibition of AR action was not observed in MCF-10A cells (Fig. 
2). Soluble chondroitin sulfate was found to have only a slight 
inhibitory effect on AR-induced DNA synthesis (Fig. 2C) and 
none of the three GAGs had any significant effect on mitogenesis
triggered by EGF (Fig. 2, A-C). These results suggested
that soluble HS/heparin compete with an extracellular HS-like 
GAG molecule for binding of AR and that the interaction be- 
tween this extracellular HS-like molecule and AR is important 
to the eventual activation of the EGFR. 

Chlorate Inhibits AR-triggered Mitogenesis and EGFR
Autophosphorylation in MCF-10A Cells — To determine if HS GAG
produced by MCF-10A cells is critical to AR-induced mitogenic
signaling by the EGFR, two distinct approaches were used.
First, cells were grown in the presence of chlorate to specifically 
interfere with the proper biosynthesis of the sulfated GAG 
chains. Chlorate is a competitive inhibitor of ATP sulfurylase 
action because it competes with the recognition of sulfate by the 
enzyme (41). Growth of cells in the presence of chlorate results
in reduced sulfation of GAGs (33). MCF-10A cells which were
cultured in 10 mM sodium chlorate lost the ability to respond to
exogenous 250 pM AR (Fig. 3A). However, chlorate also had a
slight inhibitory effect on mitogenesis induced by 250 pM EGF
(Fig. 3B). To confirm that the inhibition of cell growth caused
by chlorate is via the competitive inhibition of sulfation, 5 mM 
Fig. 2. Effect of sulfated glycosaminoglycans on AR-induced
mitogenesis in human mammary epithelial cells. MCF-10A cells
were cultured as described previously (4), plated into 96-well plates at
a density of 2,000 cells/well and a mitogenesis assay was performed (3)
by adding 250 pM AR or EGF in the absence or presence of various
concentrations of soluble heparin (A), heparan sulfate (B), or
chondroitin sulfate (C) (Sigma). DNA synthesis was determined by
quantifying the incorporation of [3H]thymidine into DNA. Percent
inhibition of DNA synthesis for each concentration of glycosaminoglycan
was calculated relative to the level of DNA synthesis achieved in the
presence of growth factor and the absence of the glycosaminoglycan.
Data points represent the mean ± S.E. of experiments performed in
triplicate. Values for incorporation of [3H]thymidine into DNA in the
absence of glycosaminoglycan for the control (no growth factor), AR
and EGF treatment were 8,512, 49,975, and 56,840 cpm/well, respectively.
Chondroitin sulfate is a mixture of chondroitin sulfate A and C (Sigma).
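
The percent-inhibition calculation described in the legend to Fig. 2 can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name is hypothetical, and it assumes that raw cpm values are compared directly without subtracting the no-growth-factor control, since the legend does not state whether background was subtracted. The AR-alone and EGF-alone values are those quoted in the legend; the halved reading is an invented input used only to exercise the function.

```python
def percent_inhibition(cpm_with_gag: float, cpm_without_gag: float) -> float:
    """Percent inhibition of DNA synthesis relative to growth factor alone.

    Assumption: raw cpm values are used as-is (no background subtraction).
    """
    if cpm_without_gag <= 0:
        raise ValueError("reference cpm must be positive")
    return 100.0 * (1.0 - cpm_with_gag / cpm_without_gag)


# Reference values quoted in the Fig. 2 legend (cpm/well, no glycosaminoglycan).
AR_ALONE_CPM = 49_975.0
EGF_ALONE_CPM = 56_840.0

# Hypothetical reading: a well whose incorporation is half of the AR-alone value.
hypothetical_cpm = AR_ALONE_CPM / 2
print(percent_inhibition(hypothetical_cpm, AR_ALONE_CPM))
```

By this definition, the IC50 is the glycosaminoglycan concentration at which the function returns 50 for the growth factor in question.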
Fig. 3. Inhibition by chlorate of AR-triggered mitogenic signaling
by the EGFR. MCF-10A cells were cultured for 3 days in medium
(4) which lacked or contained 10 mM sodium chlorate (ClO₃⁻) with or
without an additional 5 mM sodium sulfate (SO₄²⁻). The cells were
trypsinized and plated in the appropriate medium into 96-well plates
(2,000 cells/well) for the mitogenesis assay (A and B) (3) or plated into
100-mm dishes (785,000 cells/dish) for the EGFR autophosphorylation
assay (C). Mitogenesis induced by 250 pM AR (A) or EGF (B) was
measured by quantifying the incorporation of [3H]thymidine into newly
synthesized DNA (cpm/well) after a 64-h exposure to growth factor.
Data points represent the mean ± S.E. of experiments performed in
triplicate. The EGFR autophosphorylation assay (C) was performed as
described under "Experimental Procedures." Cells were stimulated with
250 pM AR or EGF for 9 min and lysed, and the EGFR was
immunoprecipitated using E7 antiserum (40). The EGFR was fractionated
in an 8% polyacrylamide SDS-PAGE gel, transferred to a polyvinyl
difluoride membrane, and tyrosine-phosphorylated EGFR was detected
using biotinylated PY-20 antibody, streptavidin-horseradish peroxidase
conjugate, and enhanced chemiluminescence. The bottom of Panel C
represents a Western blot analysis of an aliquot of total cell crude
lysate from each experimental dish which was probed with the anti-EGFR
E7 antiserum.

sodium sulfate was added to the medium in the presence of 10
mM chlorate. Sulfate partially rescued the response of the cells
to AR (Fig. 3A), whereas sulfate had no significant effect on the
minor inhibition that chlorate had on EGF-evoked mitogenesis
(Fig. 3B). These results indicate that the response of the cells to
AR is dependent upon normal cellular ATP sulfurylase function,
whereas the response of the cells to EGF, as expected, is
independent of ATP sulfurylase activity.

Mitogenic signaling by the EGFR is believed to occur by
ligand-triggered dimerization and autophosphorylation of the
EGFR by an intermolecular trans -mechanism (42). This acti- 
vation of the EGFR via autophosphorylation on tyrosine resi- 
dues results in access of the EGFR tyrosine kinase domain to 
cytosolic substrates and recruits signaling molecules with Src 
homology domains (SH2) which can specifically interact with 
tyrosine-phosphorylated regions of the EGFR. Activation of the
EGFR via autophosphorylation appears to be a necessary phe- 
nomenon for AR-induced mitogenesis (4). Thus, we investigated 
the effect that chlorate had on the ability of AR to trigger 
autophosphorylation of the EGFR. Growth of the cells in 10 mM
sodium chlorate significantly reduced the ability of 250 pM AR 
to drive autophosphorylation of tyrosine residues on the EGFR 
and the response of the EGFR to AR was completely rescued by
5 mM sodium sulfate (Fig. 3C). Neither chlorate nor chlorate 
plus sulfate had any significant effect on EGF-induced auto- 
phosphorylation of the EGFR (Fig. 3C). Western blot analysis 
of cell lysates with anti-EGFR antibodies demonstrated that 
the treatment of the cells with chlorate had no significant effect 
on EGFR levels in the MCF-10A cells (bottom of Fig. 3C). 
Therefore, in the case of AR, autophosphorylation of the EGFR 
and the resultant mitogenic signaling by the EGFR is dependent
upon the proper biosynthesis of a sulfated molecule.

Treatment of Cells with Heparitinase or Heparinase Inhibits 
AR-induced Mitogenic Signaling by the EGFR — The second approach
used to study the role that GAGs play in mitogenesis
elicited by AR involved the utilization of enzymes which cleave 
GAG chains at specific sites. Chondroitinase ABC (EC 4.2.2.4) 
catalyzes the removal of dermatan sulfate and chondroitin sulfate
side chains of proteoglycans (43) whereas heparitinase (EC
4.2.2.8) and heparinase (EC 4.2.2.7) cleave distinct sites within 
HS GAG chains (44). Exposure of MCF-10A cells to either he- 
paritinase or heparinase almost completely blocked the ability 
of AR to drive mitogenesis, whereas chondroitinase ABC had no 
effect on this phenomenon (Fig. 4A). Heparitinase and heparinase
inhibited the AR-stimulated growth of the cells by approximately
93 and 81%, respectively. Conversely, the stimulation
of cell division elicited by EGF was not significantly
affected by these GAG-degrading enzymes (Fig. 4B). Treatment 
of the cells with heparitinase or heparinase prior to the addi- 
tion of exogenous AR dramatically inhibited activation of the 
EGFR as evidenced by the lack of AR-triggered autophospho- 
rylation of tyrosine residues in the EGFR (Fig. 4C). In contrast, 
exposure of the cells to chondroitinase ABC had little or no 
effect on the response of the EGFR to AR (Fig. 4C). These
enzymes did not significantly alter the ability of EGF to acti- 
vate the EGFR nor did they affect EGFR levels in the MCF-10A 
cells (Fig. 4C, bottom panel). These results demonstrate that 
the sulfated GAG which is critical to AR-induced mitogenic 
signaling by the EGFR is structurally very similar to HS 
but does not appear to be related to dermatan or chondroitin
sulfate. 

DISCUSSION 

AR is a potent stimulator of proliferation in a number of 
different cell types including normal and malignant epithelial 
cells, fibroblasts and keratinocytes (1-7). Overexpression of AR
has often been observed in human malignancies of the breast, 
colon, stomach and pancreas (7, 9-16) and in human colon 
carcinoma cells AR can function as an autocrine growth stim- 
ulator (7). However, AR is also expressed by epithelial cells in a 
number of normal human tissues including the mammary 
gland (7, 9-13). AR has been shown to act as an autocrine 
growth factor for normal human mammary epithelial cells (45, 46)
and to function as an autocrine growth stimulator in MCF-10A
cells when they are transformed by overexpression of activated
ras or the EGFR-like tyrosine kinase erbB2 (47). The
work which we have reported here demonstrates that extracellular
HS GAG chains play a very important role in AR action in
human mammary epithelial cells and are essential to the
mitogenic activation of the EGFR which is evoked by AR.

Fig. 4. Inhibition of AR-induced EGFR mitogenic signaling by
heparitinase and heparinase. MCF-10A cells were plated into 96-well
plates at a density of 2,000 cells/well and a mitogenesis assay was
performed (3) by adding 250 pM AR (A) or EGF (B) in the absence or
presence of 0.017 unit/ml of chondroitinase ABC (CH), heparitinase
(HT), or heparinase (HR) (ICN Biomedicals). Every 12 h, 5 µl of
medium with or without 0.0017 unit of the appropriate enzyme were added
to each well. Mitogenesis was measured by quantifying the incorporation
of [3H]thymidine into newly synthesized DNA (counts/min/well)
after a 64-h exposure to growth factor. Data points represent the mean
± S.E. of experiments performed in triplicate. The EGFR
autophosphorylation assay (C) was performed exactly as described under
"Experimental Procedures" except that, prior to exposure to 250 pM AR or
EGF, glycosaminoglycan chains were digested by treating the cells for 1
h at 37 °C with 0.02 unit/ml of chondroitinase ABC (CH), heparitinase
(HT), or heparinase (HR). The bottom of Panel C represents a Western
blot analysis of an aliquot of total cell crude lysate from each
experimental dish which was probed with the anti-EGFR E7 antiserum.

It seems most probable that these HS chains exist on the 
surface of the cell where they can be covalently linked to an 
integral membrane protein as in the case of the syndecans (30,
31) or linked to cell surface lipid via glycosyl phosphatidylinositol
(PI) as in the case of glypican (32, 48). Treatment of MCF-10A
cells with phosphatidylinositol-specific phospholipase C
had no effect on AR-induced mitogenesis suggesting that the
HS chains essential to AR functioning are not
phosphatidylinositol-anchored.² It has been shown that a high affinity
receptor for acidic FGF contains covalently attached HS (37).
However, even though there are 4 potential GAG attachment 
sites in the extracellular domain of the EGFR (49), treatment of
the cells with the GAG-degrading enzymes had no apparent 
effect on the migration of the EGFR under conditions of reduc- 
ing SDS-PAGE (Fig. 4C, bottom panel). Thus, the HS chains 
which are critical to AR-triggered mitogenic signaling by 
the EGFR do not appear to be covalently linked to the EGFR 
itself, but may be attached to one or more syndecan-like 
proteoglycans. 

On formalin-fixed A431 cells and immobilized plasma membrane
preparations, AR is significantly less effective than EGF
at competing for the binding of 125I-EGF to the EGFR (1).
Conversely, AR is as potent as EGF in stimulating mitogenesis in
MCF-10A cells (4, 47). One interpretation of these findings is 
that on living cells, HS chains stabilize a mitogenic signaling 
complex between AR and the EGFR. In the case of basic FGF, 
soluble heparin can substitute for the presence of the HS pro- 
teoglycan and can reconstitute the biological action of basic 
FGF in cells which lack the requisite HS proteoglycan (33-36).
The addition of soluble heparin or heparan sulfate at concentrations
ranging from 0.1 ng/ml to 10 µg/ml into the culture
medium did not reconstitute the mitogenic response in
chlorate-treated MCF-10A cells exposed to AR.² This strongly
suggests that the HS chain(s) that are essential to AR-triggered
activation of the EGFR need to be tethered to the cell surface. 

Unlike EGF and transforming growth factor-α (TGF-α), but
similar to HB-EGF, AR contains a very basic NH2-terminal
extension, relative to its EGF-like domain (1). The finding that 
a synthetic peptide corresponding to this region of AR (residues 
26-44) binds to heparin-agarose and can compete for the binding
of AR to heparin-agarose strongly suggests that this region
of AR is at least partially involved in the interaction with HS 
which is required for AR-induced mitogenic signaling by the 
EGFR. Consistent with this observation is the fact that a ho- 
mologous region in HB-EGF appears to be directly involved in 
the interaction between HB-EGF and heparin (50) and HS 
appears to be necessary for HB-EGF stimulation of smooth 
muscle migration (39). Therefore, it is plausible that mechanis- 
tically, HB-EGF may function in a manner similar to AR. 
Whether HS GAG is required for all the biological responses 
elicited by AR and HB-EGF remains to be seen. 

Also contained within residues 26-44 of AR are two putative 
nuclear localization signals (1) and indeed, immunoreactive AR 
has been detected in the nucleus of cells in vitro and in vivo (6, 
7, 9-13). The addition of exogenous 125I-AR to several human
carcinoma cell lines results in a preferential association of ra- 
diolabeled AR with nuclei, relative to radiolabeled EGF (51). 
Further, expression of SDGF (rat AR) lacking the secretory 
signal peptide results in nuclear accumulation of SDGF (52). It 
is possible that this very basic NH2-terminal region of AR performs
distinct functions depending upon whether AR is intracellular
or extracellular. Lastly, the finding that an accessory
molecule containing HS is needed for efficient AR action pro- 
vides a mechanism by which AR may act in a more specific 
manner, relative to EGF and TGF-α. Since the EGFR is expressed
on numerous different cell types in vivo, AR action,
in contrast to that of EGF and TGF-α, may be specifically
targeted to cells which co-express EGFR and the proper HS 
proteoglycan(s). 

We have isolated the gene for a novel growth regulator, amphiregulin (AR), that is evolutionarily related to
epidermal growth factor (EGF) and transforming growth factor α (TGF-α). AR is a bifunctional growth
modulator: it interacts with the EGF/TGF-α receptor to promote the growth of normal epithelial cells and
inhibits the growth of certain aggressive carcinoma cell lines. The 84-amino-acid mature protein is embedded
within a 252-amino-acid transmembrane precursor, an organization similar to that of the TGF-α precursor.
Human placenta and ovaries were found to express significant amounts of the 1.4-kilobase AR transcript,
implicating AR in the regulation of normal cell growth. In addition, the AR gene was localized to chromosomal
region 4q13-4q21, a common breakpoint for acute lymphoblastic leukemia.



Cell growth and differentiation are regulated in part by the 
specific interaction of secreted growth factors and their 
membrane-bound receptors. Receptor- ligand interaction re- 
sults in activation of intracellular signals leading to specific 
cellular responses. Epidermal growth factor (EGF), platelet- 
derived growth factor, insulin, insulinlike growth factor 1, 
colony-stimulating factor 1, and fibroblast growth factor all 
transmit their growth-modulating signals by binding to and 
activating receptors with intrinsic tyrosine kinase activity 
(reviewed in reference 30). Characterization of the physio- 
logic and chemical effects that result from the binding of 
EGF and transforming growth factor α (TGF-α) to the EGF
receptor has served as a useful model for understanding 
receptor-ligand interactions, signal transduction, and the 
regulation of cell growth and oncogenesis (reviewed in 
reference 76). 

We have recently reported the purification and sequence 
analysis of a glycoprotein isolated from the conditioned 
media of 12-O-tetradecanoylphorbol-13-acetate (TPA)-treated
MCF-7 cells (65, 66). The protein was termed amphiregulin
(AR) to reflect its bifunctional activities: it inhibits
the growth of many human tumor cells, and it stimulates the 
proliferation of normal fibroblasts and keratinocytes. The 
secreted protein exists as a monomer of either 78 or 84 amino 
acids (aa), with the shorter form lacking the six N-terminal
residues of the larger molecule. Sequence analysis reveals 
that AR has a region with striking homology to EGF (38%) 
and TGF-α (32%), yet it also has an N-terminal extension of
43 aa composed primarily of very basic, hydrophilic residues 
(Lys, Arg, and Asn). In addition, AR has functional homology
with this class of growth factors; it partially competes
for binding of EGF to the EGF receptor and can supplant the
need for EGF or TGF-α to maintain keratinocytes in culture.
AR differs from EGF and TGF-α in that it fails to promote
anchorage-independent growth of normal rat kidney (NRK)
fibroblasts in the presence of TGF-β and inhibits the growth
of certain tumor cells that proliferate in response to EGF or
TGF-α.

In this report, we describe the isolation and characterization
of cDNA and genomic clones for human AR, the
transcriptional profile of the AR gene, and its chromosomal
localization. Like EGF and TGF-α, AR is synthesized as a
transmembrane precursor, with the secreted protein being
released by proteolytic cleavage. AR is found in many of the
same tissues (ovary, testis, and breast) and tumor types
(squamous carcinomas and mammary adenocarcinomas) as
TGF-α, but analysis of individual tumor cell lines reveals an
inverse correlation between expression of TGF-α and AR. In
addition, AR transcription is stimulated in several human
breast cancer cell lines after treatment with TPA. These
findings suggest that AR has novel growth-regulatory activities
on both normal and neoplastic cells.

MATERIALS AND METHODS 

Cell culture. All cells were obtained from the American 
Type Culture Collection. MCF-7 cells were maintained in 
50% Iscove's modified Dulbecco medium-50% Dulbecco 
modified Eagle medium containing 10% heat-inactivated 
fetal bovine serum and 0.6 µg of insulin per ml. All other cell 
lines were grown in Dulbecco modified Eagle medium sup- 
plemented with 10% fetal bovine serum. 

cDNA cloning. Total cellular RNA was isolated from 
MCF-7 cells after treatment with 100 ng of TPA per ml for 
24, 40, and 72 h by the guanidinium method. Poly(A)+ RNA 
was isolated from pooled aliquots of these samples. First- 
strand cDNA synthesis was performed on 5 µg of poly(A)+ 
RNA primed with oligo(dT) essentially as described by 
Gubler and Hoffman (29). Second-strand synthesis was 
performed with 4 U of RNase H and 115 U of DNA polymer- 
ase I. T4 DNA polymerase (10 µg) was used for removal of 
3' overhangs, creating blunt ends. Double-stranded cDNA 
was sized over a Sephadex G-50 column to select for cDNAs 
longer than 500 base pairs (bp), and then 150 ng of cDNA 
was dG tailed with terminal deoxynucleotidyl transferase. 
dG-tailed cDNA was ligated into the EcoRI site of λgt10 by 
using the BR1 adapters (AATTCCCCCCCCCCCC) as de- 
scribed by Rose et al. (59). Duplicate nitrocellulose lifts were 
taken on 2.5 x 10^5 recombinants, and filters were probed with 
best-guess (ARK31 and ARK41) and degenerate (ARD41 
and ARD58) oligonucleotides derived from the human am- 
phiregulin protein sequence (66). The oligonucleotide probes 
(including their degeneracy or length and corresponding 
amino acid residues) which were used for screening the 
λgt10 library were as follows: ARD41, 5'-TAYTCYTGYTG 
RCAYTTRCA-3' (64-fold, CKCQQEY); ARD58, 5'-TGYT 
CAATRTAYTTRCAYTC-3' (32-fold, ECKYIEH); ARK41, 
5'-TCGCCACACCGCTCGCCAAAGTACTCCTGCTGGCA 
CTTGCAGGTCACAGCCTCCAGATGCTCAATGT-3' (67- 
mer, YIEHLEAVTCKCQQEYFGERCGE); ARK31, 5'-TT 
GCACTCGCCATGGATGAGAAGTTCTGGAACTCAGCA 
TTGCATGGGTTCTT-3' (53-mer, KNPCNAEFQNFCIHG 
ECK). The probes all correspond to the antisense mRNA 
strand, and degenerate residues are R = A or G, Y = C or T. 
Oligonucleotides were labeled with [γ-32P]ATP by using T4 
polynucleotide kinase (3 x 10^8 cpm/µg). The λgt10 cDNA 
library was first screened with a mixture of degenerate 
probes (ARD41 plus ARD58) and later reprobed with a 
mixture of nondegenerate probes (ARK31 plus ARK41). 
Hybridization was performed in oligonucleotide hybridiza- 
tion mix (6x SSC [1x SSC is 0.15 M NaCl plus 0.015 M 
sodium citrate], 5x Denhardt solution, 0.15% sodium PPi, 
0.1 mg of salmon sperm DNA per ml, 0.1 mg of tRNA per 
ml) at 37°C overnight. Washes were done in 6x SSC at 37°C. 
Under these conditions ARK31 failed to show a reproducible 
signal. 
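
The degeneracy and antisense orientation quoted for the two degenerate probes can be checked with a short script. The following is a minimal sketch, not part of the original methods: the function names are hypothetical, and the standard genetic code is assumed. It multiplies the number of alternatives at each position (R and Y each count as two) and translates the reverse complement of each probe; only complete codons are translated, so the last residue of each peptide, which is covered by two nucleotides only, is not reported.

```python
from itertools import product

# IUPAC codes used by the probes in the text: R = A or G, Y = C or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "R": "Y", "Y": "R"}

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligonucleotide."""
    n = 1
    for base in oligo:
        n *= len(IUPAC[base])
    return n


def reverse_complement(oligo):
    """Reverse complement, keeping R/Y ambiguity codes (R pairs with Y)."""
    return "".join(COMPLEMENT[base] for base in reversed(oligo))


def translate_degenerate(sense):
    """Translate complete codons; 'X' marks codons whose variants disagree."""
    peptide = []
    for i in range(0, len(sense) - len(sense) % 3, 3):
        variants = product(*(IUPAC[base] for base in sense[i:i + 3]))
        residues = {CODON_TABLE["".join(v)] for v in variants}
        peptide.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(peptide)


ARD41 = "TAYTCYTGYTGRCAYTTRCA"
ARD58 = "TGYTCAATRTAYTTRCAYTC"

for name, probe in (("ARD41", ARD41), ("ARD58", ARD58)):
    sense = reverse_complement(probe)
    print(name, degeneracy(probe), sense, translate_degenerate(sense))
```

Under these assumptions the script reproduces the 64-fold and 32-fold degeneracies and recovers the leading residues of CKCQQEY and ECKYIEH from the sense strands.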

DNA sequence analysis. The 1.25-kilobase-pair (kb) cDNA 
insert of λAR1 was sequenced on both strands by the 
dideoxy-chain termination method (61) with specific oligo- 
nucleotide primers. Additional cDNAs were used to confirm 
the sequence of the coding region. Genomic DNA spanning 
the entire cDNA sequence, intron-exon junctions, and the 5' 
regulatory region was also sequenced. No differences were 
detected between the genomic and cDNA sequences. 

Northern (RNA) and Southern blot analyses. RNA was 
isolated from subconfluent cells grown in T150 tissue culture 
flasks or from fresh-frozen tissue samples. Total RNA (10 or 
20 µg) was fractionated on 1.0% agarose-formaldehyde gels, 
transferred to nylon membranes (Amersham Hybond-N), 
and UV cross-linked (1,200 on Stratalinker; Stratagene). 
Probes were prepared by random-prime 32P-labeling (spe- 
cific activity, 5 x 10^8 to 25 x 10^8 cpm/µg) of a 480-bp cDNA 
fragment (ARBP1) spanning the entire coding region of 
mature AR or a 170-bp cDNA fragment (AR170) encoding 
the transmembrane and cytoplasmic domains of AR (23). 
Hybridizations were performed in 5x SSPE (1x SSPE is 
0.18 M NaCl, 10 mM sodium phosphate [pH 7.7], plus 0.1 
mM EDTA)-5x Denhardt solution-0.5% sodium dodecyl 
sulfate (SDS)-20 µg of denatured salmon sperm DNA per ml 
at 42°C for 16 h with 2 x 10^6 cpm of [32P]ARBP1 per ml. 
Blots were washed several times in 2x SSC-0.1% SDS at 
65°C and then in 1x SSC-0.1% SDS at 65°C and exposed on 
Kodak X-OMAT with two Du Pont Cronex Lightning Plus 
intensifying screens at -70°C. 

Genomic DNAs were isolated from subconfluent cells in 
T150 tissue culture flasks. DNA (20 µg) was digested, 
analyzed on 0.8% agarose gels (SeaKem GTG), and blotted 
onto nylon membranes (Hybond-N). Filters were hybridized 
overnight at 42°C in Southern hybridization buffer (6x SSC, 
5x Denhardt solution, 0.5% SDS, 20 µg of denatured salmon 
sperm DNA per ml) containing 2 x 10^6 cpm of 32P-labeled 
AR-specific fragment per ml. Filters were washed exten- 
sively in 1x SSC-0.1% SDS at 65°C and autoradiographed 
overnight at -70°C. 

Genomic cloning. MCF-7 DNA was digested with HindIII 
and electrophoresed on 0.8% low-gel-temperature agarose 
(Bio-Rad Laboratories). DNA was extracted from the agar- 
ose fractions spanning the 12- and 6.4-kb range and inserted 
into the HindIII site of the λL47.1 vector (45). Nitrocellulose 
plaque lifts were hybridized overnight at 42°C in Southern 
hybridization buffer containing 10% dextran sulfate. Filters 
were washed in 2x SSC-0.1% SDS at 65°C and then in 0.5 x 
SSC-0.1% SDS at 65°C. 

Primer extension analysis. Synthetic oligonucleotides 
AR(CP) and AR(AP) complementary to the AR 5' cDNA 
sequence were 32P end labeled with T4 polynucleotide 
kinase to a specific activity of 2 x 10^8 to 5 x 10^8 cpm/µg. 
Labeled oligonucleotide (10^6 cpm) was used to prime first- 
strand cDNA synthesis on 50 µg of MCF-7 RNA. The 
products were treated with RNase A, extracted with phenol 
and chloroform, ethanol precipitated, and analyzed by elec- 
trophoresis on standard 8% polyacrylamide-7 M urea se- 
quencing gels. The sequences (and positions as numbered in 
Fig. 2) of the AR-specific oligonucleotides used for primer 
extension analysis are as follows: AR(AP), 5'-GCGGCG 
CCTCGGGCTGTCCCG-3' (-151 to -171); AR(CP), 5'-CC 
GCTCTCGAAGGCTTGGGGAG-3' (-114 to -135). 

Plasmid constructions. For the promoter assays, plasmid 
pARS5, containing the 726-bp EcoRI-SstI fragment from the 
5' end of the AR gene, was constructed (see Fig. 2). pARS5 
was digested with SmaI and SphI, blunted with S1 nuclease, 
and religated to generate pARS5Sm, which has the SmaI site 
(171 nucleotides upstream of the AR-initiating ATG) of the 
AR 5' untranslated region (5'UTR) adjacent to a HindIII 
site. The 693-bp EcoRI-HindIII fragment was isolated from 
pARS5Sm and ligated into the HindIII site of the expression 
vector pSV0CAT by using a double-stranded oligonucleotide 
linker, ELNK3. The linker provides a HindIII-EcoRI 
adapter with internal KpnI, AatII, and SalI sites to be used 
for generating exonuclease III deletions (33). The resulting 
expression construct, pXARE1CAT, contains 648 bp of AR 
5'-flanking sequences, the cap site, and an additional 40 bp of 
exon 1 of the AR gene fused to the chloramphenicol acetyl- 
transferase (CAT) gene. This plasmid was then linearized 
with KpnI-SalI, and constructs containing sequentially 
smaller amounts of the AR 5' region were generated by using 
the exonuclease III digestion method (33). The inserted 
DNA and flanking regions were verified by sequence analy- 
sis. 

CAT assays. MCF-7 cells (2 x 10^6) were plated in 10 ml of 
medium in a 100-mm dish 12 to 16 h before transfection. The 
cells were transfected with 20 µg of calcium phosphate- 
precipitated supercoiled plasmid DNA, and after 4 h were 
subjected to a 25% glycerol shock for 90 s. They were then 
fed 20 ml of fresh medium containing 0 or 100 ng of TPA per 
ml. At 40 h after transfection, the cells were washed, 
collected, and lysed by sonication in 100 µl of 0.25 M Tris 
hydrochloride (pH 7.8). CAT activity was assayed essen- 
tially as described previously (27). Cell extract (3 to 50 µl) 
was added to 2.5 mCi of [14C]chloramphenicol (Du Pont, 
NEN Research Products) in a 150-µl reaction volume con- 
taining 0.5 M Tris (pH 7.8) and 0.5 mM acetyl coenzyme A. 
The reactions were incubated at 37°C for 2 h, extracted with 
1 ml of ethyl acetate, and developed on silica gel thin-layer 
chromatography plates with CHCl3-1-butanol (95:5). The 
thin-layer chromatography plates were dried and autoradio- 
graphed. The acetylated and unacetylated [14C]chloramphen- 
icol was quantified in a scintillation counter. CAT enzymatic 
activity was calculated as micrograms of chloramphenicol 
acetylated per hour per milligram of protein in the cell 
extract. 
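
The activity calculation described above can be sketched as follows. This is an illustrative sketch only: it assumes that the fraction of label recovered in the acetylated spots equals the fraction of input chloramphenicol acetylated, and the function name, parameter names, and example numbers are hypothetical rather than taken from the experiments.

```python
def cat_specific_activity(acetylated_cpm, unacetylated_cpm,
                          chloramphenicol_ug, hours, protein_mg):
    """Micrograms of chloramphenicol acetylated per hour per milligram of protein.

    Assumes the acetylated fraction of the counted label equals the acetylated
    fraction of the chloramphenicol present in the reaction.
    """
    total_cpm = acetylated_cpm + unacetylated_cpm
    if total_cpm <= 0 or hours <= 0 or protein_mg <= 0:
        raise ValueError("counts, incubation time, and protein must be positive")
    fraction_acetylated = acetylated_cpm / total_cpm
    return fraction_acetylated * chloramphenicol_ug / (hours * protein_mg)


# Hypothetical example: 20% conversion of 1 µg of substrate in a 2-h
# reaction containing 0.05 mg of extract protein.
activity = cat_specific_activity(2000, 8000, chloramphenicol_ug=1.0,
                                 hours=2, protein_mg=0.05)
print(f"{activity:.2f} µg acetylated per h per mg of protein")
```

Normalizing to incubation time and extract protein lets lanes that received different volumes of extract be compared directly.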

Chromosomal localization. Plasmid pAR9, containing the 
complete AR cDNA sequence except for 100 bp from the 
3'UTR, was random-prime labeled by using 3H-nucleotides 
to a specific activity of 3 x 10^7 cpm/µg (23). In situ 
hybridization to metaphase chromosomes from lymphocytes 
of a normal male donor was performed with AR probe 
concentrations of 10 to 30 ng/µl (49). The slides were 
exposed for 3 to 4 weeks, and chromosomes were identified 
by Q-banding. 

RESULTS 

AR cDNA. Amphiregulin was initially identified and se- 
quenced from the supernatant of MCF-7 cells following 2 to 
3 days of treatment with TPA (66). This human breast 
carcinoma cell line was then used as a source for cloning the 
AR cDNA. Two λgt10 clones were plaque purified based on 
positive hybridization to oligonucleotide probes derived 
from the AR protein sequence. The clones (λAR1 and λAR2) 
each contained a single 1.3-kb EcoRI insert and were se- 
quenced with a degenerate oligonucleotide (ARD58) to ver- 
ify that they specifically encoded a protein whose sequence 
matched that of a major portion of the AR protein. 

A 170-bp fragment (AR170) from the AR cDNA clone was 
used to probe a second cDNA library of 200,000 recombi- 
nants. Thirteen positive recombinants were identified. The 
inserts ranged from 300 bp to 1.3 kb; six were longer than 1 
kb, and five contained a single SstI site known to be within 
100 bp from the 5' end of the longest cDNA clone. Four of 
these larger inserts were subcloned for further restriction 
and sequence analysis. All clones had identical restriction 
maps based on BsmI, EcoRV, PvuII, SstI, and SmaI diges- 
tions, except one, which had a 3' truncation of 100 bp and 
was later found to originate from an A5 tract, presumably 
sufficient for priming with oligo(dT). 

Exact oligonucleotide primers were used to sequence both 
strands of the 1,230-bp AR cDNA (Fig. 1A). An open 
reading frame of 965 bp begins at nucleotide 1. The first 
AUG, at position 210, does not conform with the optimal 
consensus for translational start sites (37) owing to lack of a 
purine at position -3, although the second methionine 
(position 378) does match this consensus. The true start site 
is thought to be at the first AUG, since it is followed by a 
predicted 19-aa stretch of predominantly hydrophobic resi- 
dues typical of a signal peptide sequence (72). 

The cDNA encodes a protein precursor of 252 aa with a 
210-bp 5'UTR. The translational termination signal (TAA) at 
position 757 is followed by a 262-bp 3'UTR. Comparison of 
the cDNA sequence with the nondegenerate probes showed 
75% (ARK41) and 77% (ARK31) overall homology. Neither 
probe had a consecutive match of more than 8 nucleotides; 
however, 50 of 67 aligned nucleotides were sufficient to 
produce a detectable signal with ARK41 under conditions of 
low stringency. The codon usage by human AR mRNA 
sequence differs considerably from the usage frequencies 
reported by Lathe (41) and explains why the degenerate 
probes ARD41 and ARD58 showed stronger hybridization 
than the longer, preferred codon usage probes. 
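
The coordinates quoted for the coding region can be cross-checked by simple codon arithmetic. The sketch below is an illustration with hypothetical function and variable names; it uses only the first-AUG position (210 in the cDNA numbering of Fig. 1, 1 in the numbering of Fig. 2), the 252-residue precursor length, and the 262-bp 3'UTR stated in the text.

```python
CODON_LENGTH = 3


def stop_codon_start(start_codon_pos, n_residues):
    """Position of the first nucleotide of the stop codon."""
    return start_codon_pos + CODON_LENGTH * n_residues


N_RESIDUES = 252   # AR precursor length (aa)
AUG_IN_CDNA = 210  # first AUG, cDNA numbering (Fig. 1)
UTR3_LENGTH = 262  # 3'UTR length (bp)

# Fig. 2 numbering starts at the A of the initiating ATG.
stop_fig2 = stop_codon_start(1, N_RESIDUES)
# cDNA numbering starts at the 5' end of the longest clone.
stop_cdna = stop_codon_start(AUG_IN_CDNA, N_RESIDUES)
orf_end = stop_cdna - 1
cdna_length = stop_cdna + CODON_LENGTH - 1 + UTR3_LENGTH

print(stop_fig2, orf_end, cdna_length)
```

With these inputs the stop codon begins at position 757 in the numbering of Fig. 2, the open reading frame that begins at nucleotide 1 ends at nucleotide 965, and the cDNA totals 1,230 bp, in agreement with the values given above.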

Hydropathy analysis of the AR precursor sequence re- 
vealed two hydrophobic domains and one extended hydro- 
philic stretch (Fig. 1B). The hydrophilic region is of notable 
length and magnitude, scoring below -4.0 with the algorithm 
of Kyte and Doolittle (39). Six structural domains are 
predicted in the 252 residue AR precursor: a 19-aa signal 
sequence (aa 1 to 19); an 81-residue amino-terminal domain 
(aa 20 to 100) which is serine rich (17 of 81 aa); an 84-residue 
region encoding the mature AR (aa 101 to 184), the first 43 aa 
being the hydrophilic domain and the last 41 aa showing 
homology to EGF and TGF-α; a hydrophobic, 23-residue 
putative transmembrane domain (aa 199 to 221) flanked by 
basic residues; and a 31-residue carboxy-terminal cytoplas- 
mic domain (aa 222 to 252). The unglycosylated AR precur- 
sor minus the signal peptide has a predicted molecular 
weight of 25,942. 
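
A hydropathy profile of the kind shown in Fig. 1B can be approximated with a sliding-window average of Kyte-Doolittle values. The sketch below is illustrative: the 7-residue window is an assumption (the window used for Fig. 1B is not stated), the function name is hypothetical, and the two test sequences are the signal peptide (aa 1 to 19) and part of the basic stretch of mature AR as read from Fig. 2.

```python
# Kyte-Doolittle hydropathy values.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean Kyte-Doolittle score for every full window along the sequence."""
    scores = [KD[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


SIGNAL_PEPTIDE = "MRAPLLPPAPVVLSLLILG"        # aa 1 to 19
BASIC_STRETCH = "KPKRKKKGGKNGKNRRNRKKKNPC"   # within the hydrophilic domain

print("signal peptide maximum:", round(max(hydropathy_profile(SIGNAL_PEPTIDE)), 2))
print("basic stretch minimum:", round(min(hydropathy_profile(BASIC_STRETCH)), 2))
```

With this window the basic stretch reaches its minimum over the residues RRNRKKK, whereas the signal peptide scores strongly positive, consistent with the hydrophilic and hydrophobic assignments given above.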

The 78- and 84-aa forms of mature AR are synthesized as 
the middle portion of a 252-aa transmembrane precursor. 
The cDNA sequence confirms the AR peptide sequence 
except for aa 113, which was sequenced as Asp (D) by 
protein analysis and was translated as Asn (AAC = N) from 
the cDNA sequence. The coding region was verified from 
three additional cDNA clones, and five cDNAs were all 
found to have their 5' end within 25 bp of the longest clone. 
The cleavage sites for the release of the 78- and 84-aa forms 
of this growth factor do not correspond to those of known 
proteases, although a basic amino acid is located two resi- 
dues downstream of both the N- and C-termini. Cleavage 
occurs between Asp-Asp and Ser-Val or between Glu-Gln 
and Val-Val at the amino termini and between Gln-Lys and 
Ser-Met at the carboxy terminus. 

AR gene. Southern analysis of MCF-7 DNA digested with 
HindIII, EcoRI, or BamHI showed single bands (12, 8, and 
>20 kb, respectively) when hybridized with a probe contain- 
ing the 5' portion of the AR cDNA (nucleotides 1 to 670). 
The absence of multiple bands suggests that AR is a single- 
copy gene. A further 3' cDNA probe (nucleotides 681 to 850, 
spanning the transmembrane and cytoplasmic domains) hy- 
bridized to the same fragments and to an additional 6.4-kb 
HindIII fragment. These results indicate that the mature AR 
coding region is split between two HindIII fragments. Sim- 
ilar banding was seen on digests from human placenta, brain, 
melanoma (SK-MEL 28), choriocarcinoma (JEG-3), and 
epidermoid carcinoma (A-431) DNA, suggesting that there 
are no gross rearrangements or amplifications of the AR 
gene. 

The two HindIII fragments were cloned from MCF-7 
DNA, since these were likely to contain most, if not all, of 
the AR gene and flanking sequences. Of nine positive clones, 
two (λARH12 and λARH6) were selected for more detailed 
characterization. The 12- and 6.4-kb inserts were then sub- 
cloned and mapped for several restriction sites. The se- 
quences of the exons and adjacent intron regions were 
determined with exact oligonucleotide primers combined 
with direct sequencing of smaller subclones. The genomic 
sequence confirmed the transcribed sequence that was de- 
termined from the cDNA clones (Fig. 2). 

The human AR precursor is contained in six exons, 
spanning 10.2 kb of genomic DNA (Fig. 2 and 3). The exons 
vary in size from 112 to 270 bp and are interrupted by introns 
ranging from 1.25 to 2.1 kb. The intron-exon boundaries of 
the AR gene all conform with the canonical splice consensus 
sequence. The five introns of AR interrupt the coding 
sequence near the borders of the various protein domains 
(Fig. 3). Exon 1 encodes the 5'UTR and signal peptide; exon 
2 encodes the N-terminal precursor; exon 3 encodes the very 
basic and hydrophilic N-terminal portion of AR as well as 
the first two loops of the EGF-like region; exon 4 comprises 
the third loop of the EGF-like motif and the transmembrane 
domain; exon 5 contains the cytoplasmic region; and exon 6 
represents the 3'UTR. There is no intron separating the 
hydrophilic domain from the EGF-like motif, nor are any 
cryptic splice sites apparent between these domains. This 
hydrophilic region is purine rich (55 of 59 residues are A or 
G) and has no repetitive sequences. The process by which 
this unusual N-terminal extension became juxtaposed with 
the EGF-like region remains unclear. 
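
The purine bias of the hydrophilic region can be verified directly from the exon 3 sequence in Fig. 2. In the sketch below, the boundaries of the 59-nucleotide window (starting at the AAA codon that follows CCC) are an assumption chosen for illustration, since the text does not state them; the function name is hypothetical.

```python
# Exon 3 codons for KRKKKGGKNGKNRRNRKKK plus the first two nucleotides of
# the next codon, as read from Fig. 2 (window boundaries are an assumption).
CODONS = (
    "AAA AGA AAG AAA AAG GGA GGC AAA AAT GGA "
    "AAA AAT AGA AGA AAC AGA AAG AAG AAA AA"
)
REGION = CODONS.replace(" ", "")


def purine_count(sequence):
    """Number of A or G residues in a nucleotide sequence."""
    return sum(base in "AG" for base in sequence)


print(f"{purine_count(REGION)} of {len(REGION)} residues are purines")
```

For this window the count is 55 purines in 59 nucleotides, matching the figure quoted above; the four pyrimidines are the third positions of one Gly (GGC) codon and three Asn (AAT, AAT, AAC) codons.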

Analysis of the AR 5' regulatory region. The recombinant 
1 A6AC6TTCSCACACCT6G6T6CCAGCSCCCCA6A66T 
101 ACACTCCC66TCTCCACTCGCTCn(XAACACCC6 C TC6T^ 

MRAPLLPPAPVVLSLL1L65 6 H 
201 6AA66ACCA AT6 lAGA GCC CC6 CT6 CTA CC6 CC6 6C6 CC6 6T6 GT6 CT6 TC6 CTC TT6 ATA CTC G6C TCA 66C CAT 

30-CHO 40 »***t 

Y A A 6 I 0 L NOTYsbKREPF 5 ^ 0 H S A 0 
276 TAT SCT GCT GGA TTfi GAC CTC AAT GAC ACC TAC TCT G6G AAG C6T GAA CCA TTT TCT G66 GAC CAC AST 6CT GAT 

50 1# **§° 70 

6FCVTSRSEMS56SEISPVSCNPSS 
351 GGA TTT GAG 6TT ACC TCA AGA ACT GAG AT6 TCT TCA G6G ACT GAG ATT TCC CCT 6T6 A6T GAA ATG CCT TCT A6T 

•»**• 80 ♦ ♦ ♦ 90 

SCPSS6ADYDYSCEY0NEPQIPGY I 
426 AGT GAA CCG TCC TC6 GGA GCC GAC TAT GAC TAC TCA GAA GAG TAT GAT AAC GAA CCA CAA ATA CCT GGC TAT An 

100 1 110 CHO — CHO 

VDOSVRVE 0 T V VICPPgilKTESEIITSD 

501 GTC gat GAT TCA. 6TC MA fiTT 6AA CA6 fiTA 6TT KC CCC \M MC M6 Att BM_agt_6AJ MT ACT TC*. 6AJ 

130 m 

KPKRKKKGGKNGKNRRNRKKKNPCII 
576 AAA CCC AAA AGA AAG AAA AAG GGA GGC AAA AAT GGA AAA AAT A6A A6A AAC A6A AAG AAG AAA AAT CCA TGT AAT 

150 160 170 

AEF0NFC1H6CCKTIEHLEAVTCKC 
651 GCA GAA TTT CAA AAT TTC TGC ATT CAC GGA GAA TGC AAA TAT ATA GAG CAC CTC GAA GCA GTA ACA TGC AAA fGT 

160 190 
OOEVFGERCGEKSNKTHSMIDSSLS 
726 CA6 CAA 6AA TAT TTC G6T GAA CS6 TGT GG6 GAA AAG TCC AT6 AAA ACT CAC AGC ATG ATT GAC AGT AGT TTA TCA 

200 210 220 

KIALAAIAAFHSAVIITAVAVITVQ 
801 AAA An GCA TTA GCA GCC ATA GCT GCC TTT AT6 TCT GCT 6T6 ATC CTC ACA GCT 6TT GCT 6TT ATT ACA GTC CAG 

230 240 
IRROYVRKrEGEAEERICICLRgENGN 
876 CTT AGA AGA CAA TAC 6TC AGG AAA TAT GAA GGA GAA GCT GAG GAA CGA AAG AAA CTT CGA CAA GAG AAT GGA AAT 

250 

V H A 1 A 

951 GTA CAT GCT ATA GCA TAACTGAAGATAAAACTACAGGATATCACAn6GAGTCACTGCCAA6TCATA 
1046 CAGTG6ATCATAA6ACMT66ACCCTTTn6nAT6ATGGTTTTAAACTn 
1146 AAAA6TAnHTTCAA6TT6TAAATAATnATTTAATATnAAT66AAST6TAT 
FIG. 1. Nucleotide sequence of the human AR cDNA and predicted amino acid sequence of the AR precursor. (A) Nucleotide numbering 
is on the left, and amino acid numbering is above every tenth residue. The 84 residues corresponding to the mature secreted AR protein are 
underlined, the predicted hydrophobic signal peptide is underscored with a dashed line, and the transmembrane sequence is marked with a 
double underline. An arrow marks the alternate N-terminal cleavage site for mature AR. Potential N-linked glycosylation sites are denoted 
by -CHO-, potential N-glycosaminoglycan attachment sites are identified by asterisks, consensus tyrosine sulfation sites are shown with a 
diamond, and basic residues conforming to a nuclear localization sequence are overscored with a dotted line. An exact AATAAA 
polyadenylation splice site is not present in the 3'UTR; however, a common polyadenylation site, CAGCT, is located 23 bp preceding the 
poly(A)* tract, indicated by the double underscore. Also underscored in the 3'UTR are the mRNA-destabilizing consensus motifs (ATTTA) 
characteristic of several cytokines and lymphokines (64). (B) Hydropathy profile of the predicted amino acid sequence of human AR precursor 
based on the algorithm of Kyte and Doolittle (39). 



phage λARH12 contains 6.5 kb of genomic DNA 5' to the 
cDNA clones. To better understand factors that govern the 
expression of AR, we localized the 5' end of the AR gene by 
using primer extension analysis. Endogenous AR transcripts 
were characterized by using mRNA isolated from TPA- 
stimulated MCF-7 cells. Primer extension revealed three 



major transcriptional start sites between positions -211 and 
-209 (Fig. 4A), with the first located just 1 bp upstream of 
the longest cDNA clone (Fig. 2). A second oligonucleotide 
primer confirmed the extension products at positions -211 
and -210. 

To determine whether the AR 5 '-flanking sequences con- 
-858 GaS^CATATCCACCTGGCTTTGAACA 

-758 AATGTTGATGAGGCCCCTGGGCCACATAAAGAAATAGGGAGTGAGGGGATT^^ 

Seal 

-658 AGATAGAAGCCCATCCTACATGAAGCMTTCCTCATTGAGTTCTCTCGTCCTTTATCCTTGTTGGAAACATCAGGCAAAGTCA 
-558 CTTTTACATCTAMTACGGAACTCTTCTATTrAATCCCTGTCTGTC 

Ddei 

-458 TCAGGTACTAGCTCCGGAACTCCAGTCCTGCTCGCCCTCAAAMCGGCTTGCAGCTAGAGGTTTAAGITCCACTTCCTCTCAGCGAATC^ 
-358 GGGAt^Gt^GCGTGTGTCCTCCGCGCGTCGTTTTCGGGTAGCACCnCTGGGGCa 
-258 GGCCCCCTCCCGG?I^GCC TATAAA CCGGCAGCTGCGCGCC 

-158 AGGCOXGCGCCCGCCGCCCC 

I MRAPLLPPAP 
-58 TCGTGTCCCAGAGACCGAGTTGCCCCAGAGACCGAGACGCCGCCGCTGCGAAGGACCA ATG AGA GCC CCG CTG CTA CCG CCG GCG CCG 

11VVLSLLILGSG 

31 GTG GTG CTG TCG CTC TTG ATA CTC GGC TCA Ggtgaggattcaacggcgctgaactgctgggctctcctcccatggcaggt . . (2. 1 kb). 

22 HYAAGLDLNDT 
62 ...agcaccctactttaccttttcgttttcttcctttattccctcccctgcagGC CAT TAT GCT GCT GGA TTG GAC CTC AAT GAC ACC 

33 Y SGKREPFSGDHSADGFEVTS RS EM 
97 TAC TCT GGG AAG CGT GAA CCA TTT TCT GGG GAC CAC AGT GCT GAT GGA TTT GAG GTT ACC TCA AGA AGT GAG ATG 

58SSGSE1 SPVSEHPSSSEPSSGADYO 
172 TCT TCA GGG AGT GAG ATT TCC CCT GTG AGT GAA ATG CCT TCT AGT AGT GAA CCG TCC TCG GGA GCC GAC TAT GAC 

83 YS E E YDNEPQI PGY I VDDSVRV 

247 TAC TCA GAA GAG TAT GAT AAC GAA CCA CAA ATA CCT GGC TAT ATT GTC GAT GAT TCA GTC AGA G gtgagtaggggataa 

311 agcaaaaatatggcctgtgagatgtgggtttata. .(1.4 kb) . .aattatattcaagtttgagagactcttgtcaataaatcttttcttttttagTT 

105 EQ^VVKPPQNKTESENTSOKPKRKKK 
313 GAA CAG GTA GTT AAG CCC CCC CAA AAC AAG ACG GAA AGT GAA AAT ACT TCA GAT AAA CCC AAA AGA AAG AAA AAG 

130 GGKNGKN RRNRKKKNPCNAEFQNFC 
388 GGA GGC AAA AAT GGA AAA AAT AGA AGA AAC AGA AAG AAG AAA AAT CCA TGT AAT GCA GAA TTT CAA AAT TTC TGC 

155 IHGECKYIEHLEAVTCK 

463 ATT CAC GGA GAA TGC AAA TAT ATA GAG CAC CTG GAA GCA GTA ACA TGC A qtaaqttttcctaaagcatatagatttttgtattt 

172 C Q Q E 

512 ctagcaccatgtctg. . . . (1 .25 kb) . ■ .cacaccgcacgtgagtgtgattataatttttaaatgtgaattgcttgcag AA TGT CAG CAA GAA 

176 YFGERCGEKSHKTHSMIDSSLSKIA 
526 TAT TTC GOT GAA CGG TGT GGG GAA AAG TCC ATG AAA ACT CAC AGC ATG ATT GAC AGT AGT TTA TCA AAA ATT GCA 

201 LAA I AAFMSAV I LTAVAV I TVQ 

601 TTA GCA GCC ATA GCT GCC TTT ATG TCT GCT GTG ATC CTC ACA GCT GTT GCT GTT ATT ACA GTC CAgtaagtatgacata 

665 acttacaaattcttaataaaataatgggaggttaat. . . (2.0 kb) . . .tatagatgaatagaaccttgataacattagaatgccttgttctctgaagG 

223 LRRQYVRKYEGEAEERKKLRQENGN 

666 CTT AGA AGA CAA TAC GTC AGG AAA TAT GAA GGA GAA GCT GAG GAA CGA AAG AAA CTT CGA CAA GAG AAT GGA AAT 

248 V H A I A 

742 GTA CAT GCT ATA GCA TAA CTGAAGATAAAATTACAGgtttgagttttaaaatatatctttagatcatatcctataattttgaaaaatttaac. . 

. . (2.0 kb) . . .gtaacattttgttttattttattattttattttattttattttctcacagGATATCACATTGGAGTCACT GCCA AGTCATAGCCATA 
AATGATGAGTCGGTCCTCTTTCCAGTGGATCATAAGACAATGGACCCTTTTTGTTATGA 
TATATAMGGTGCACGAAGGTAAAAAGTATTTTTTCAAGTTGTAAATAATTTAn 
TTTAACCAAAcaaattgagagtttgaatattagttctgatattgcaagactccagtgtacttttctc 

FIG. 2. Genomic and primary amino acid sequence of human AR. The sequence was determined from the two genomic HindIII fragments 
isolated from MCF-7 cells. The six exons are shown in capitals with introns and flanking sequences in lowercase letters. Nucleotide and 
protein sequences are numbered on the left, and the lengths of unsequenced intron regions are shown in parentheses. The mature AR coding 
region is underlined. The sequence of the AR promoter is also shown with numbering in reverse up to the EcoRI site at position -859. 
Selected restriction sites are shown as reference points. The furthest 5' end of the mRNA is marked with a forward arrow, and an upstream 
TATA sequence is indicated by a double underline. An Sp1 consensus sequence and cAMP-responsive element (CRE) are underlined. 



tain an active promoter, a series of vectors were constructed 
that contain the bacterial CAT gene under the control of 
various fragments from the AR 5'-flanking region (Fig. 4B). 
A transient-expression assay was used to assess promoter 
strength after transfection of the CAT vectors into MCF-7 
cells (27). Plasmid pXARE1CAT, containing the 688-bp 
EcoRI-SmaI fragment from AR 5'-flanking sequences fused 
to CAT, showed detectable promoter activity that increased 
five- to eightfold in response to 40-h treatment with TPA. A 
similar plasmid, pXARE1dCAT, containing 167 bp of AR 
5'-flanking sequences, showed activity comparable to that of 
pXARE1CAT and was also TPA inducible (Fig. 4B and C). 
Nuclear runoff experiments confirm three- to fivefold-in- 
creased transcription of AR in MCF-7 cells in response to 
TPA (data not shown). Taken together, these findings func- 
tionally identify a TPA-responsive promoter in the 5'- 
flanking region of the AR gene. 

The nucleotide sequence of an 859-bp genomic fragment 
derived from plasmid pARE6 is presented in Fig. 2. The 
155-bp sequence immediately upstream of the 5' end of the 
FIG. 3. Map of the human AR gene showing the exon organization and protein domains. The gene is drawn to scale in a 5'-to-3' 
orientation. The six exons are shown (■), with the length in base pairs being listed directly under each exon. Intron lengths in kilobase pairs 
are indicated between each exon. Cleavage sites for 10 selected restriction enzymes are shown. Asterisks denote sites which are present in 
the cDNA. The corresponding position of each exon about the AR mRNA is shown above on a 15-fold-larger scale. Protein domains are 
represented by shaded boxes. The number of amino acid residues in each domain is indicated. The two dark filled boxes represent 
hydrophobic stretches that correspond to the signal peptide and transmembrane (TM) domains. Mature AR is represented by two boxes, the 
N-terminal hydrophilic domain and the C-terminal EGF-like motif. 
FIG. 4. Structural and functional characterization of the AR 5' regulatory region. (A) Primer extension analysis of AR mRNA. AR-specific 
primers were hybridized to 50 µg of total cytoplasmic RNA from MCF-7 cells (lane 1) or MCF-7 cells treated with TPA for 24 h (lanes 2 to 
4). Lanes 1 and 2 contained primer AR(CP) at 5 x 10^5 cpm per reaction, whereas lanes 3 and 4 contained primer AR(AP) at 5 x 10^4 or 5 x 
10^3 cpm per reaction, respectively (see Materials and Methods). cDNA was extended and subjected to electrophoresis on an 8% 
polyacrylamide-7 M urea gel. A dideoxy-sequencing ladder using primer AR(CP) on plasmid pARE6 is shown in lanes 5 to 8 (CTAG). Dots 
label the extended products, which were 96 and 97 nucleotides in lane 2 and 59, 60, and 61 nucleotides in lanes 3 and 4. Extensions from two 
separate primers concur on positions -211 and -210 as the AR transcriptional start sites. (B) Chimeric AR-CAT constructions. pARE6 is 
an AR genomic clone containing exon 1 and flanking sequences. Exon 1 is shown (CD), with an arrow indicating the direction of transcription, 
and the coding region is also shown ( ). A 689-bp EcoRI-SmaI fragment containing 41 bp of AR exon 1 and 5'-flanking sequences was 
inserted in front of the CAT gene (M) to generate pXARE1CAT. Exonuclease III (33) was used to delete 522 bp from the 5' end of this 
construct, retaining 167 bp of AR 5'-flanking DNA in plasmid pXARE1dCAT. (C) Induction of CAT enzyme activity in MCF-7 cells following 
transfection with pXARE1CAT (lanes 1 and 2) or pXARE1dCAT (lanes 3 and 4) and treatment with TPA for 40 h (lanes 2 and 4). An 
autoradiogram of a thin-layer chromatography plate is shown, with the acetylated products migrating above the unacetylated chloramphen- 
icol. Above each lane is the amount of chloramphenicol (micrograms) acetylated per picogram of protein per hour. These results have been 
replicated four times. 
FIG. 5. The AR gene is located on human chromosome 4 band q13. (a) Distribution of 66 sites of hybridization on metaphase spreads of 
human chromosomes. (b) Regional localization of the AR gene on human chromosome 4 at 4q13-21. Shown is a histogram of 17 silver grains 
localized to chromosome 4. A nonrandom distribution of grains was observed, with a peak at bands 4q13-21. 



mRNA has a high G+C content (73%) and a single Sp1- 
binding site. A consensus TATA sequence resides 28 bp 
upstream of the first mRNA start site, but no CAAT box is 
apparent. A potential cyclic AMP (cAMP)-responsive ele- 
ment (5'-TGACGTCA-3' [19]) is present beginning 64 bp 
upstream of the mRNA, and studies are under way to assess 
its significance. A consensus sequence often found in TPA- 
inducible genes [5'-TGA(G/C)TCA-3'] is not present in this 
region, suggesting that TPA may stimulate AR gene expres- 
sion through a nonclassical pathway (10). 
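
A motif scan of the kind described above can be sketched with regular expressions. The example is illustrative only: the test sequence is a made-up fragment rather than the AR promoter, the motif labels and function name are hypothetical, and only the forward strand is searched. It shows that the 8-bp cAMP-responsive element does not satisfy the 7-bp TGA(G/C)TCA consensus, so the two elements are scored independently.

```python
import re

# Consensus motifs discussed in the text, written as regular expressions.
MOTIFS = {
    "TATA": "TATAAA",
    "CRE": "TGACGTCA",             # cAMP-responsive element
    "TPA consensus": "TGA[GC]TCA",  # TGA(G/C)TCA
}


def find_motifs(sequence):
    """Return 0-based start positions of each motif on the given strand."""
    sequence = sequence.upper()
    return {
        name: [match.start() for match in re.finditer(pattern, sequence)]
        for name, pattern in MOTIFS.items()
    }


# Hypothetical fragment containing a CRE and a TATA sequence.
example = "GGCCTGACGTCAGGTATAAACCGG"
for name, positions in find_motifs(example).items():
    print(name, positions)
```

A complete scan of a promoter would also search the reverse complement, since these elements can function in either orientation.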

Chromosomal location of the AR gene. The AR gene was 
mapped on human metaphase chromosomes by in situ hy- 
bridization (49) with a cDNA clone as a 3H-labeled probe. 
Forty-two metaphase cells were examined in this analysis. 
Of 66 sites of hybridization scored, 12 (18%) were located 
between bands q13 and q21 of the long arm of chromosome 
4 (Fig. 5a). The largest number of grains was at band q13 
(Fig. 5b). There was no significant hybridization to other 
chromosomes. Localization was confirmed by polymerase 
chain reaction on hamster x human somatic cell hybrid 
DNA containing only human chromosome 4. Oligonucleo- 
tide primers derived from AR exon 3 and the flanking intron 
generated a 220-bp polymerase chain reaction fragment only 
in human DNA and the somatic cell hybrid DNA containing 
chromosome 4; in contrast, the Chinese hamster ovary 
(CHO) DNA was negative (data not shown). 

Cellular sources of AR transcripts. Northern analysis re- 
vealed a single 1.4-kb mRNA species for AR, which was 
detected in both total and poly(A) + RNA fractions from a 
variety of human tissues and tumor cell lines. The 1.4-kb 
transcript was most prevalent in RNA derived from human 
ovary and placenta and less abundant in pancreas, cardiac 
muscle, testis, colon, breast, lung, spleen, and kidney RNA 
(Fig. 6A). Little or no hybridization was seen in the adrenal, 
brain, duodenum, epidermis, liver, parathyroid, prostate, or 
thymus. AR-specific hybridization also was quantified by 
solution hybridization with 32P-antisense RNA probes (35). 
This analysis revealed 3- to 10-fold-higher AR expression in 
the ovary and placenta than in the pancreas or spleen. 



Several human cell lines were examined for expression of 
AR based on the presence of the 1.4-kb AR-homologous 
transcript and on solution hybridization. AR was expressed 
in six human breast carcinomas (MCF-7, MDA-MB-361, 
T-47D, BT-474, MDA-MB-231, and SKBR-3), but not in 




FIG. 6. Northern blot analysis of AR expression in normal 
human tissues and tumor cell lines. (A) Total RNA was isolated 
from the following human tissues and probed with an AR cDNA 
fragment. Lanes: 1, pancreas; 2, cardiac muscle; 3, testis; 4, 
placenta; 5, ovary; 6, epidermis; 7, duodenum; 8, colon; 9, breast; 
10, cerebral cortex; 11, adrenal. Lane 12 is radiolabeled λ/HindIII 
markers with sizes shown in kilobases. All lanes contained 30 µg of 
total cytoplasmic RNA, except lanes 4, 5, and 9, on which 15 µg was 
loaded. The exposure time was 3 days at -70°C. (B) TPA induction 
of AR mRNA expression in tumor cell lines. Lanes 1 to 4 show a 
time course for TPA induction of AR in MCF-7 cells. RNA was 
isolated after 0 h (lane 1), 3 h (lane 2), 6 h (lane 3), or 24 h (lane 4) 
of exposure to 100 ng of TPA per ml. All other TPA treatments were 
for 24 h. Lane 5, MDA-MB-361 untreated; lane 6, MDA-MB-361 
plus TPA; lane 7, BT-474 plus TPA; lane 8, T-47D plus TPA; lane 9, 
CaKi-1 plus TPA; lane 10, AsPC-1 plus TPA; lane 11, λ/HindIII 
markers. RNA was loaded on lanes 1 to 6 at 10 µg per lane and on 
lanes 7 to 10 at 20 µg per lane. 
three others (MDA-MB-157, MDA-MB-468, and MDA-MB- 
453). AR was also detected in one of three kidney carcino- 
mas (CaKi-1), two of four pancreatic carcinomas (AsPC-1 
and Capan-1), and two of three ovarian tumors (CaOv-3 and 
SK-OV-3). Several embryonic (HEL-299, HEPM, JEG-3, 
HUF, and PA-1), melanoma (SK-MEL-28 and A-375), neu- 
ral (CRL 7386 and SK-N-MC), and hematopoietic (CEM, 
U-937, and HSB-2) lines were negative. The highest levels of 
AR expression were seen in breast carcinoma cell lines that 
are estrogen receptor positive and contain low levels of EGF 
receptor (MCF-7, MDA-MB-361, T-47D, and BT-474) (Fig. 
6B). TGF-α is also produced by many human tumor cells and 
has been detected in 70% of primary breast tumors (2); 
however, unlike AR, expression of TGF-α correlated with 
overexpression of the EGF receptor (17, 20). The four breast 
carcinoma cell lines that overexpressed AR (MCF-7, 
MDA-MB-361, T-47D, and BT-474) had low or undetectable levels 
of TGF-α (20, 52, 79; G. Plowman, unpublished observation). 
We conclude that AR is overexpressed in several 
mammary tumor cell lines, although its regulation is quite 
different from that of TGF-α. 

The time course of AR induction by TPA was determined 
for MCF-7 and HBL-100 cells, the latter being a nonmalig- 
nant, estrogen receptor-negative, and progesterone recep- 
tor-negative breast cell line. RNA was isolated at 0, 3, 6, 24, 
and 48 h after TPA treatment and subjected to Northern and 
solution hybridization analysis. MCF-7 cells reproducibly 
showed an 11-fold increase in AR mRNA by 3 h with a 
maximum 18-fold increase by 24 h (Fig. 6B). In contrast, 
HBL-100 cells expressed detectable levels of AR only at the 
1-h time point (data not shown). Of 18 human cell lines, 9 (5 
of breast carcinoma origin) also showed increased AR 
expression in response to 24-h treatment with TPA. MDA-MB-361, 
a breast adenocarcinoma cell line with brain metastasis, 
showed the highest constitutive level of AR (21 pg 
of AR RNA per µg of total RNA), whereas MCF-7 cells 
showed the highest level after TPA induction (109 pg of AR 
RNA per µg of total RNA). 

Because the AR 3'UTR contains A+U-rich sequence 
elements common to a number of genes which undergo rapid 
turnover (64), the stability of AR mRNA was analyzed in 
both the presence and absence of TPA. MCF-7 cells were 
treated with 5 to 40 µg of dactinomycin per ml, and cytoplasmic 
RNA was extracted at 0, 1, 2, 4, and 6 h. The decay 
of AR mRNA was monitored by Northern analysis and 
solution hybridization assays. A 6-h treatment with the 
lower dose of dactinomycin had no effect on AR mRNA 
levels. With higher concentrations of the drug, the AR 
half-life was shown to be 3.5 h, in both the presence and 
absence of TPA. The requirement for higher dactinomycin 
doses is probably due to delayed drug uptake by MCF-7 cells 
or to the inherent drug resistance common to many tumor 
cell lines (38). Despite the multiple ATTTA consensus 
sequences present in its 3'UTR, we conclude that the AR 
mRNA is quite stable in MCF-7 cells. Although similar 
motifs are reported to promote mRNA decay in normal 
hematopoietic cells, they may have a stabilizing effect in 
certain tumor lines (63). Multiple mechanisms are involved 
in the TPA stimulation of AR gene expression in MCF-7 
cells; these include a two- to eightfold increase in transcription 
and the subsequent accumulation of a relatively stable 
mRNA. Additional studies are under way to ascertain 
whether similar mechanisms modulate AR gene expression 
in other cell types. 
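
The 3.5-h half-life reported above can be turned into an expected decay curve. The following is a minimal sketch, assuming simple first-order (exponential) decay once transcription is blocked; that model and the function names are assumptions of this example, not an analysis taken from the study. Only the half-life and the sampling times come from the text.

```python
import math

HALF_LIFE_H = 3.5                 # AR mRNA half-life reported in the text (hours)
SAMPLE_TIMES_H = [0, 1, 2, 4, 6]  # RNA extraction time points from the text


def fraction_remaining(t_hours, half_life_hours=HALF_LIFE_H):
    """Fraction of transcript left after t_hours, assuming first-order decay."""
    return 0.5 ** (t_hours / half_life_hours)


def decay_constant(half_life_hours=HALF_LIFE_H):
    """First-order rate constant k (per hour), from t1/2 = ln(2) / k."""
    return math.log(2) / half_life_hours


if __name__ == "__main__":
    print(f"k = {decay_constant():.3f} per hour")
    for t in SAMPLE_TIMES_H:
        print(f"{t} h: {fraction_remaining(t):.2f} of initial AR mRNA")
```

Under this assumption, roughly 30% of the starting transcript would still be present at the 6-h time point, which is in line with the description of the message as relatively stable.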



DISCUSSION 

Structural comparison of AR and EGF-like growth modu- 
lators. A search of the amino acid (Protein Identification 
Resource, release 21) and nucleotide sequence (EMBL, 
release 19; GenBank, release 60) libraries was performed 
with the AR sequence. Only the region spanning the mature 
growth factor showed significant homology with any of the 
sequences in the data bases. EGF was most closely related 
to AR, with identical residues at 16 of the 37 (43%) positions 
in the region spanning the three disulfide loops (62). In the 
same region, TGF-α (47) and AR have 12 residues in 
common, TGF-α and EGF have 18, and EGF and vaccinia 
virus growth factor (7) have 17 (Fig. 7). The homology 
between these growth modulators includes the conserved 
spacing of the six cysteines involved in three disulfide bonds 
that define their secondary structure. Only four additional 
residues are completely conserved between AR and the 
other mammalian and viral EGF-like growth factors (Gly-13, 
Tyr-32, Gly-34, and Arg-36 [Fig. 7]; note that positions 
referred to in the text are relative to the first cysteine in 
mature EGF, as shown in Fig. 7A). Most proteins that bind 
the EGF receptor also contain His-11 and Gly-31; the 
exceptions are the viral EGF homologs from Shope fibroma 
virus and myxoma virus, which have Asn-11 and Glu-31 (9, 
71), and AR, which has Glu-31. Residue 31 of the EGF-like 
proteins is located within a predicted type II β-turn, where 
such substitutions may be considered conservative (74). 
Two-dimensional nuclear magnetic resonance studies of 
EGF and TGF-α (11, 50) suggest that the highly conserved 
spacing of cysteines and glycines may define the basic 
structure of the protein backbone while other conserved, or 
conservatively changed, residues may form the functional 
receptor recognition site (residues 8, 11, 36, 38, and 42). 
Further support for their involvement in receptor binding is 
provided by analyses suggesting that these residues may all 
lie on the same face of the molecule (8). 
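
The identity figures quoted above are simple ratios over the 37 compared positions. As a minimal sketch, the helper below computes percent identity for two pre-aligned sequences and reproduces the percentages from the match counts given in the text. The function name, the gap convention, and the toy strings are assumptions of this example; the toy strings are not real growth factor sequences.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Match counts quoted in the text, over the 37 compared positions.
REPORTED_MATCHES = {
    ("AR", "EGF"): 16,
    ("AR", "TGF-alpha"): 12,
    ("TGF-alpha", "EGF"): 18,
    ("EGF", "VGF"): 17,
}
POSITIONS = 37

if __name__ == "__main__":
    for (first, second), matches in REPORTED_MATCHES.items():
        print(f"{first} vs {second}: {100.0 * matches / POSITIONS:.0f}%")
    # Hypothetical toy alignment, used only to exercise the function.
    print(percent_identity("CAG-TC", "CTGATC"))
```

Skipping gapped columns is one common convention; counting gaps as mismatches would give lower values, so the convention should be stated whenever identities are compared across studies.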

Functional evidence that the conserved residues are necessary 
for biological activity has been obtained by characterization 
of derivatives of TGF-α and EGF. Recombinant 
proteins, synthetic peptides, site-specific chemical deriva- 
tives, and proteolytic degradation have all been useful for 
generating altered molecules (14, 31, 42). Some generaliza- 
tions include the following: (i) the six cysteines (positions 1, 
9, 15, 26, 28, and 37) and their disulfide loops (positions 1 to 
15, 9 to 26, and 28 to 37) and Arg-36 are required for 
biological activity; (ii) N-terminal extensions have little 
effect on activity; (iii) an aromatic residue (Phe, Tyr) is 
required at position 8; and (iv) a nonconservative change of 
Tyr-32, Asp-41, or Leu-42 results in loss of activity and/or 
dramatic loss in receptor binding and autophosphorylation. 

AR fits all but criterion (iv), since it truncates at position 
40 and consequently lacks the final two conserved residues 
(Asp-41 and Leu-42). Despite this, AR competes for EGF 
receptor binding and substitutes for EGF in some mitogenic 
assays (66). These carboxy-terminal differences may be 
responsible for nonsaturable receptor-binding kinetics and 
the functional differences between AR and EGF, including 
an inability to synergize with TGF-β (66), marked differences 
from EGF and TGF-α in the inhibition of certain cell 
lines, and differences in cross-linking and phosphorylation 
assays (data not shown). The extended hydrophilic region at 
the N terminus of mature AR may also account for the 
disparities in receptor-binding and biological activity. 

Structural domains that are superficially similar to EGF 
are found in diverse proteins, including several of the blood 
FIG. 7. Protein sequence homologies between the EGF-like growth modulators. (A) Amino acid alignment of the EGF-like motif and 
flanking transmembrane domain from three human proteins and one viral protein known to bind the EGF receptor. Alignment and numbering 
begin at the first cysteine of these motifs, and the most highly conserved residues are boxed. The putative transmembrane domains are 
underlined, and arrowheads mark the proteolytic cleavage sites where the mature growth modulators are released from their membrane-bound 
precursors. Exon-intron boundaries are displayed as facing arrows situated above the interrupted amino acids. The 3' junction of human 
TGF-α is inferred from that of the rat gene. Vaccinia virus growth factor (VGF) contains no introns. (B) Schematic diagram of the predicted 
secondary structure of human AR, EGF, TGF-α, and vaccinia virus growth factor. Symbols: •, residues which are completely conserved 
among all four proteins; @, additional residues that EGF, TGF-α, or vaccinia virus growth factor have in common with AR; |||, hydrophilic 
residues in the N-terminal portion of AR. An arrow marks the alternate cleavage site for the 78-aa form of mature AR. 



coagulation factors; extracellular matrix proteins (laminin 
and tenascin); the invertebrate homeotic proteins including 
Notch, slit, and Delta from Drosophila melanogaster and 
lin-12 and glp-1 from Caenorhabditis elegans; low-density 
lipoprotein receptor and low-density lipoprotein receptor- 
related protein; and members of a family of cell adhesion 
proteins including the lymphocyte homing receptors and the 
endothelial leukocyte adhesion molecule ELAM-1 (reviewed 
in references 40, 66, and 77). Although structurally similar to 
EGF, none of these proteins maintain the precise spacing of 
all six cysteines, and, likewise, none have been shown to 
compete for binding to the EGF receptor. 

AR precursor. Mature, secreted AR is synthesized as the 
middle portion of a 252-aa transmembrane precursor. The 
AR precursor has three potential N-glycosylation sites, one 
in the N-terminal domain (position 30) and two in the 
hydrophilic region of the mature protein (positions 113 and 
119). The 78- and 84-aa forms of AR have predicted molecular 
weights of 9,173 and 9,772, respectively. N-linked 
glycosylation is known to contribute 10 to 12 kilodaltons to 
mature AR and is probably the result of carbohydrate 
addition at one or both of the sites in the hydrophilic domain. 

The N-terminal region of the AR precursor contains 19 
tightly clustered serine and threonine residues, 18 acidic 
residues, and 6 prolines. These attributes are common to 
O-glycosylation sites (48) and suggest that the AR precursor 
may contain complex carbohydrates, a modification observed 
in some larger forms of the TGF-α precursor (6, 25, 
69). This domain also contains four Ser-Gly dipeptides that 
are potential sites for glycosaminoglycan attachment (26). 
Moreover, three potential tyrosine sulfation sites (Tyr-81, 
Tyr-83, and Tyr-87) are identified in the AR precursor based 
on the presence of a tyrosine residue surrounded by acidic 
amino acids and the absence of residues that could contribute 
to steric hindrance (1, 34). Residues 55 to 102 scored high 
in the PEST algorithm (58), which identifies protein regions 
rich in proline, glutamic acid, serine, and threonine residues. 
Proteins scoring high by this analysis typically undergo rapid 
degradation. 

The hydrophilic domain of the mature AR protein is 
composed of many positively charged amino acids (16 of 43 
residues are Lys or Arg), including two consecutive 
stretches of four or five basic residues (Fig. 1A). This region 
of AR is similar to the nuclear targeting signal of simian virus 
40 large T antigen that contains a characteristic KKKRK 
sequence preceded by small amino acids (Gly, Ala, and Pro) 
thought to favor the formation of an α-helical structure 
(reviewed in reference 57). Mutation analysis has revealed 
four consecutive basic residues as the predominant feature 
of the simian virus 40 nuclear localization sequence. Other 
proteins that contain similar nuclear targeting sequences 
include histones, steroid hormone receptors, c-abl, MyoD1, 
and c-myc. Preliminary data show that AR binds single- and 
double-stranded DNA under conditions in which EGF does 
not bind (M. Shoyab and G. Plowman, unpublished data). 
Production of AR-specific antibodies (24) will be useful for 
localization of internalized AR. Possibly AR mediates some 
of its effects by being targeted to the nucleus and interacting 
with the controlling regions of other growth-regulatory 
genes. 

Differential proteolytic cleavage of a transmembrane pre- 
cursor is thought to generate the two forms of mature AR (78 
and 84 aa). The presence of an intron between the predicted 
N-terminal cleavage sites suggested that intron sliding might 
account for this difference. MCF-7 cells produce 80% of 
their AR in the smaller (78-aa) form, whereas JEG-3, a 
choriocarcinoma cell line, produces most of its protein as the 
larger (84-aa) form (data not shown). To determine whether 
this difference was due to alterations at the DNA level, the 5' 
junction of AR exon 3 was isolated from three human cell 
lines (JEG-3, CaKi-1, and MCF-7) by using polymerase 
chain reaction techniques. Direct sequence analysis revealed 
no cell-type-specific differences at the genomic level and 
supports a model of differential proteolytic processing for 
generation of the two forms of secreted AR. 

Conservation of exon structure among AR, EGF, and 
TGF-α. The human AR gene is divided into six exons, 
spanning 10.2 kb (Fig. 2 and 3). Like AR, the TGF-α gene 
has six exons (16, 18), yet it spans nearly 10 times the length 
of the AR gene. The human EGF gene is also relatively 
large, with 24 exons covering 110 kb (3, 28). The exon 
organization of the EGF-like motifs of these three proteins is 
conserved. Exons 3 and 4 of AR encode the mature protein, 
with an intron disrupting the coding sequence between the 
second and third disulfide loops. Alignment of the amino 
acid sequences for AR, EGF, and TGF-α reveals an identically 
placed intron in all three proteins (Fig. 7A). Moreover, 
the adjacent exon of each contains the transmembrane 
domains of the precursor proteins. This configuration is 
associated with secretion of an active EGF receptor-binding 
protein and suggests that the integral membrane form may be 
necessary for efficient folding of the disulfide bonds. In 
addition, recent evidence suggests that these transmembrane 
precursors may be biologically active even in the absence of 
processing (5, 75). In contrast to the pattern for the three 
growth regulators, the cysteine-rich domain is contained on 
a single exon in all other homologous mammalian proteins 
for which the exon structure has been determined, including 
the eight EGF-like repeats in the human EGF precursor (3), 
the three repeats in the low-density lipoprotein receptor 
(68), and the single repeat in the tissue-type plasminogen 
activator, urokinase, and each of the human coagulation 
factors IX, X, XII, and protein C (reviewed in references 12 
and 44). 

Differences in the structures of these EGF-like repeats 
suggest that they may have distinct origins. One group 
contains the motif bounded by introns, whereas the other 
group, of which AR, EGF, and TGF-α are members, has the 
motif interrupted by an intron at a precise location. The 
sequence similarity between the two groups could be the 
result of convergent evolution or could be due to insertion of 
a new intron subsequent to their divergence from a common 
ancestral gene. On the basis of the structural and functional 
analyses, we conclude that AR is the third member of the 
EGF/TGF-α family of growth modulators present in the 
human genome. 

The AR gene maps to a site of frequent chromosomal 
aberrations in acute leukemia. We have mapped the AR gene 
to chromosome region 4q13-4q21. This region also contains 
the genes for melanoma growth-stimulatory activity (gro), 
platelet factor 4 (PF4), the gamma interferon-inducible factor 
IP-10, interleukin-8, statherin (a calcium-regulating salivary 
protein), albumin, and α-fetoprotein (13). The gene for EGF 
is located distally at 4q25, approximately 30 × 10^6 bp away, 
while the genes for c-kit and the platelet-derived growth 
factor A-type receptor are located closer to the centromere. 
gro, IP-10, and PF4 belong to a class of structurally related 
peptides that may constitute a family of growth factors 
clustered on the proximal part of the long arm of chromosome 
4 (56). AR shows no sequence similarity to this family 
of proteins. Although TGF-α shares 32 to 33% homology 
with AR and EGF, it is located on a separate chromosome at 
region 2p13 (70). 

Chromosomal abnormalities are frequent in many human 
cancers. One-third of all cases of acute lymphoblastic leu- 
kemia involve specific translocations. The most common 
cytogenetic aberration in congenital acute lymphoblastic 
leukemia, t(4;11)(q21;q23), involves the region near the AR 
gene (32). ALL with t(4;11) is most common in infants under 
the age of 16 months, and this translocation identifies a group 
with a poor prognosis in need of aggressive therapy. Aber- 
rations involving region 4q21 have also been associated with 
T lymphomas (43), and with the piebald trait, an inherited 
disorder resulting in patchy skin pigmentation due to defi- 
cient melanoblast migration and differentiation (13). Linkage 
studies between AR and these genetic disorders will allow us 
to determine the significance of their colocalization. 

Regulation of AR expression. The tissue distribution of AR 
transcripts was distinct from that of EGF, but had some 
similarities with the distribution of TGF-α mRNA. EGF is 
expressed predominantly in the submaxillary glands, the 
distal tubules of the kidney, and lactating mammary gland, 
with lower levels in the pancreas, small intestine, and 
pituitary (54). TGF-α is expressed in a variety of tumors and 
retrovirally transformed cells (15, 17) and in early fetal 
development and preimplantation embryos (55, 73). Initial 
attempts to identify a normal adult tissue source of TGF-α 
expression were unsuccessful, in part because of low mRNA 
abundance and cell type specificity (17). TGF-α has subsequently 
been shown to play a role in the growth of normal 
cells, since its mRNA has been detected in normal and 
psoriatic adult keratinocytes, breast epithelial cells, pituitary, 
brain, ovarian thecal cells, seminiferous tubules, and 
activated alveolar macrophages (21, 46, 67, 73). Both TGF-α 
and AR are expressed in the normal ovary, testis, and breast 
tissue, and this paper shows that AR can also be detected in 
the placenta, pancreas, heart, colon, lungs, spleen, and 
kidneys. This expression profile suggested that AR serves a 
functional role in the growth of normal tissues and that it 
may be involved in early development or in processes as 
diverse as gonadogenesis, hematopoiesis, and tissue modeling 
and repair. 

Since AR was first discovered as a growth regulator from 
TPA-stimulated cells, we studied the mechanism of this 
induction. Several tumor cell lines showed increased AR 
mRNA after 24 h of treatment with TPA. This prolonged 
treatment with TPA was selected since induction of AR 
transcripts in MCF-7 cells peaks at 24 h and their supernatant 
contains the largest amount of AR protein after a 48-h 
induction. On characterization of the 5' regulatory region of 
the AR gene, a promoter was identified that was TPA 
responsive in MCF-7 cells yet lacked the consensus 
TPA-responsive element sequence. Similarly, TGF-α expression 
has been reported to be induced by TPA in MDA-MB-468 
cells (4) and in human keratinocytes (53), yet its 5'-flanking 
region also lacks a consensus TPA-responsive element (36). 
Maximal induction of AR was seen following prolonged 
treatment with TPA, conditions sufficient to induce down 
regulation of protein kinase C (60). Possibly more complex 
interactions are required for TPA induction of AR. A potential 
cAMP-responsive element exists in the 5'-flanking region 
of AR, and recent reports demonstrate that cross-talk and 
synergy occur between the protein kinase C and cAMP 
pathways (19, 51). cAMP alone can activate the 8-bp TPA- 
responsive element, but TPA has not been shown to interact 
with the 7-bp cAMP-responsive element (19). Evidence that 
phorbol esters stimulate cAMP accumulation provides a 
possible link between these two pathways (78). 

Conclusions. AR is unique among growth regulators in that 
it has the potential for two disparate mechanisms of signal 
transduction. First, like EGF, TGF-a, and other growth 
regulators, AR binds to a membrane receptor that phospho- 
rylates protein(s) in the cytoplasm of target cells and thus 
generates a second messenger(s) that regulates gene expression 
in a spatial and temporal manner. Second, AR is 
endowed with nuclear-targeting motifs and has the capacity 
to interact directly with DNA; by this mechanism, gene 
expression could be controlled in a fashion similar to that 
observed with glucocorticoid, thyroid, and morphogen re- 
ceptors (22). 

The AR precursor may also exist as an integral membrane 
"receptor" and function in cell-cell interactions or in the 
regulation of cell growth and differentiation. Such mem- 
brane-bound forms might be involved in eliciting a program 
of growth regulation that is different from that elicited by the 
secreted proteins. An important direction for future research 
will be to define which residues are responsible for the 
differences in the activities of AR and EGF/TGF-α. One 
approach will be to generate chimeric and mutated forms of 
AR by using strategies analogous to homolog-scanning mu- 
tagenesis or by methods based on sophisticated computer 
modeling. These altered ligands will also allow us to further 
define the functional binding site by which this family of 
growth modulators interacts with specific cell surface recep- 
tors and transduces their biological effects. 
THE LIST OF PEPTIDE GROWTH REGULATORS 
has been expanding rapidly. 
These factors participate in various 
physiological and pathological conditions, 
such as cellular communication, growth and 
development, embryogenesis, immune re- 
sponse, hematopoiesis, cell survival and dif- 
ferentiation, inflammation, tissue repair and 
remodeling, atherosclerosis, and cancer (1). 
The isolation, characterization, and mecha- 
nism of action of regulatory factors for 
growth and differentiation are of current 
interest because of the potential use of such 
regulatory factors in the diagnosis, progno- 
sis, and therapy of neoplasia and because of 
what these factors reveal about the basic 
mechanism of normal cellular proliferation 
and the unrestrained growth of cancer cells. 
We have recently reported the isolation of a 
novel glycoprotein termed amphiregulin 
(AR), which inhibits growth of A431 hu- 
man epidermoid carcinoma and other hu- 
man tumor cells and stimulates proliferation 
of human fibroblasts and other normal and 
tumor cells (2). AR was isolated from serum-free 
conditioned medium of MCF-7 
human breast carcinoma cells that had been 
treated with 12-O-tetradecanoylphorbol-13-acetate (2). 
We now report the complete 
amino acid sequence of amphiregulin and 
compare its biological properties with those 
of the other members of the epidermal 
growth factor (EGF) family proteins. 

AR was purified to homogeneity as de- 
scribed (2). The homogeneous AR was used 
for all the chemical and biological studies 
reported here. The amino acid sequence of 
human AR (Fig. 1) was determined by 
automated Edman degradation of N-glycanase-treated, 
reduced, and S-pyridylethylated 
AR (NG-SPE-AR) and of peptide fragments 
obtained by cleavage of NG-SPE-AR 
with various endopeptidases. The carboxyl-terminal 
analysis of NG-SPE-AR was performed 
with carboxypeptidase P (Penicillium 
janthinellum). The amino-terminal analysis 
of NG-SPE-AR revealed the presence of 
two sequences, one starting at residue 1, 
serine, and the other starting at residue 7, 
valine (Fig. 1). The yield of the larger form 
of AR was about 20% of that of the truncated 
form. The larger AR thus contains six 
additional amino acids at the amino terminus 
of the truncated form of AR. The larger 
form of AR and the truncated AR are single 
chain polypeptides of 84 and 78 residues, 
with a calculated molecular weight of 9759 
and 9060, respectively (Fig. 1). Both forms 
of AR have a similar carboxyl-terminal sequence 
as determined by carboxypeptidase P 
cleavage (Fig. 1), and both are biologically 
active. 

The sequence of AR was compared with 
all proteins in the National Biomedical Research 
Foundation database (release 15, 
containing 6796 protein sequences), Genetic 
Sequence Data Bank (Bolt Beranek and 
Newman, Los Alamos National Laboratory; 
release 54) and the European Molecular 
Biology Laboratory DNA sequence library 
(release 13). These computer-aided searches 
revealed that AR is a novel protein and a 
member of the EGF family. This family 
includes EGF (mouse, human, and rat) (3-5), 
transforming growth factor-α (TGF-α) 
(6, 7), and poxvirus growth factors [vaccinia 
(VGF), myxoma (MGF), and Shope fibroma 
(SFGF)] (8-10). Tissue-type plasminogen 
activator (11), the mammalian clotting 
factors IX and X (12), the low-density lipoprotein 
receptor (13), bovine protein C (14), 
human proteoglycan core protein (15), 
product of Drosophila notch gene (16), product 
of lin 12 gene (17), the product of cell 
lineage-specific gene of sea urchin 
Strongylocentrotus purpuratus (18), cytotactin (19), and 
product of PJs gene of Plasmodium falciparum 
(20) also contain EGF-like domains. Alignment 
of AR structure with the structure of 
EGF-like growth factors and with other 
members of EGF-like proteins (Fig. 2) reveals 
that AR, like other members of the 
family, contains the hallmark six essential 
cysteine residues, maintains conservation of 
cysteine residue spacing in the pattern 
CX7CX4CX10CX1CX8C, and also contains 
some of the characteristic and conserved 
amino acids. AR falls between the members 
of the growth factor family that look like 
EGF and TGF-α and those that look like the 
poxvirus-encoded growth factors (MGF and 
SFGF), especially in the use of asparagine. 
The amino-terminal sequence of AR has 
some analogy with the amino-terminal sequences 
of the TGF-α's (6, 7), VGF (8), and 
MGF (9) in that it is rich in prolines, serines, 
and threonines and, like TGF-α and VGF, 
has potential N-linked glycosylation sites as 
well as the possibility for O-linked glycosylation 
in the region rich in serines, threonines, 
and prolines. Unlike MGF and SFGF, AR 
does not have any potential glycosylation 
site within the growth factor domain of the 
molecule. On the basis of homology with 
mouse EGF (3) and perfect alignment of six 
cysteine residues, one would expect the presence 
of three intrachain disulfide bonds in 
AR involving cysteine residues 46 and 59, 
54 and 70, and 72 and 81. 

A Member of the Epidermal Growth Factor Family 

The complete amino acid sequence of amphiregulin, a bifunctional cell growth 
modulator, was determined. The truncated form contains 78 amino acids, whereas a 
larger form of amphiregulin contains six additional amino acids at the amino-terminal 
end. The amino-terminal half of amphiregulin is extremely hydrophilic and contains 
unusually high numbers of lysine, arginine, and asparagine residues. The carboxyl-terminal 
half of amphiregulin (residues 46 to 84) exhibits striking homology to the 
epidermal growth factor (EGF) family of proteins. Amphiregulin binds to the EGF 
receptor but not as well as EGF does. Amphiregulin fully supplants the requirement 
for EGF or transforming growth factor-α in murine keratinocyte growth, but it is a 
much weaker growth stimulator in other cell systems. 
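
The cysteine positions given for the three predicted disulfide bonds of AR (46 and 59, 54 and 70, 72 and 81) can be checked against the spacing pattern quoted for the EGF family. The sketch below derives the pattern from the positions and builds a regular expression from it. The helper names are illustrative, treating X as "any residue except cysteine" is an assumption of this example, and the test sequence is a synthetic alanine filler, not the AR sequence.

```python
import re

# Cysteine positions in mature AR, taken from the disulfide pairs in the text
# (46-59, 54-70, 72-81).
AR_CYS_POSITIONS = [46, 54, 59, 70, 72, 81]


def spacing_pattern(cys_positions):
    """Return a pattern such as 'CX7CX4C' from a list of cysteine positions."""
    positions = sorted(cys_positions)
    # Number of residues strictly between each pair of consecutive cysteines.
    gaps = [b - a - 1 for a, b in zip(positions, positions[1:])]
    return "C" + "".join(f"X{g}C" for g in gaps)


def pattern_to_regex(pattern):
    """Compile a 'CXnC...' pattern; X is assumed to mean any non-Cys residue."""
    body = re.sub(r"X(\d+)", lambda m: "[^C]{%s}" % m.group(1), pattern)
    return re.compile(body)


if __name__ == "__main__":
    pattern = spacing_pattern(AR_CYS_POSITIONS)
    print(pattern)
    # Synthetic sequence (hypothetical): alanine filler with cysteines
    # placed at the same spacing, flanked by two extra residues.
    spacers = [7, 4, 10, 1, 8]
    toy = "AA" + "C" + "".join("A" * n + "C" for n in spacers) + "AA"
    print(bool(pattern_to_regex(pattern).search(toy)))
```

A spacing match of this kind is only a screening criterion: as the text notes, many proteins carry EGF-like cysteine-rich repeats, so a hit does not by itself imply EGF receptor binding.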

AR is an extremely hydrophilic protein, 
especially the amino-terminal half of the 
molecule up to residue 45. A 23-amino acid 
stretch from residue 23 through 45 contains 
only five different amino acids (ten lysines, 
four arginines, four asparagines, three pro- 
lines, and two glycines). A tetrapeptide Arg- 
Lys-Lys-Lys is repeated twice (residues 26 
to 29 and 40 to 43) in AR. Such sequences 
have been reported to serve as a nucleus 
targeting signal (21). The hydropathy profile 
of AR exhibits little similarity with those 
of the other members of the EGF family. 

Fig. 1. Amino acid sequence of AR and schematic outline of the data supporting the sequence. The 
sequence of unfragmented NG-SPE-AR is denoted by N.T. Peptides obtained by cleavage with 
endopeptidase-Lys-C (K), with endopeptidase-Arg (R), and with endopeptidase-Glu, Staphylococcus 
aureus V8 protease (E) are indicated. CPP denotes the carboxyl-terminal sequence determined by 
digestion of AR with carboxypeptidase P. Residues identified with Edman degradation or by amino 
acid analysis are indicated by lines. Vertical bars show beginnings and endings of fragments. Lines 
without two vertical bars indicate incomplete sequences; | indicates the start of truncated AR; and * 
indicates potential glycosylation site. AR was reduced with 2-mercaptoethanol and alkylated with 
4-vinylpyridine. SPE-AR was purified by reversed-phase high-performance liquid chromatography 
(rpHPLC). SPE-AR was treated with N-glycanase to remove N-linked oligosaccharides, and NG-SPE-AR 
was purified by rpHPLC. NG-SPE-AR was cleaved with various endopeptidases, and the resulting 
peptides were separated by an rpHPLC Ca column. Peptide sequences were determined with an 
Applied Biosystems model 475A gas-phase sequencer. Identification of phenylthiohydantoin amino 
acid derivatives was carried out, on line, on a model 120A analyzer (Applied Biosystems). For 
carboxyl-terminal analysis, NG-SPE-AR was incubated with CPP, portions were withdrawn at various times, and 
the reaction was terminated. Released amino acids were derivatized with phenyl isothiocyanate, and 
phenylthiocarbamyl amino acid derivatives were analyzed and quantitated by using a micro amino acid 
derivatizer and analyzer (Applied Biosystems, model number 420-A0-03). 

The binding properties of AR were compared 
with those of mouse EGF in radioreceptor 
assays (Fig. 3A). AR inhibited the 
binding of 125I-labeled EGF to A431 cells as 
well as to A431 plasma membranes. A 50% 
inhibition of 125I-EGF binding to fixed cells 
and membranes was seen at about 1.1 and 
1.8 nM EGF, respectively, whereas a 50% 
reduction in EGF binding to cells and membranes 
was seen at approximately 1.8 and 
5.7 nM AR, respectively. Unlabeled EGF 
completely inhibited the 125I-EGF-receptor 
interaction at higher concentrations in both 
systems (Fig. 3A). However, the maximum 
competition with AR was 75% and 50% for 
binding to cells and membranes, respectively 
(Fig. 3A). The competition curves for AR 
were not parallel to that seen with EGF. 
These results suggest that AR has a lower 
affinity for EGF receptors on A431 cells 
than does EGF itself. Structural differences 
between AR and EGF might explain the 
binding data shown in Fig. 3. It is also 
possible that AR might have its own specific 
receptor, closely related to the EGF receptor. 
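
The half-maximal concentrations quoted above allow a rough comparison of how much more AR than EGF was needed to displace labeled EGF. The sketch below only takes ratios of the reported values; the names are illustrative. Because the AR competition curves were incomplete and not parallel to the EGF curves, these ratios should be read as approximate indicators, not as true affinity ratios.

```python
# Concentrations (nM) giving 50% inhibition of labeled EGF binding,
# as quoted in the text for each assay system.
IC50_NM = {
    "fixed cells": {"EGF": 1.1, "AR": 1.8},
    "membranes": {"EGF": 1.8, "AR": 5.7},
}


def fold_difference(ic50_test, ic50_reference):
    """Ratio of test to reference concentration needed for 50% inhibition."""
    if ic50_reference <= 0:
        raise ValueError("reference IC50 must be positive")
    return ic50_test / ic50_reference


if __name__ == "__main__":
    for system, values in IC50_NM.items():
        ratio = fold_difference(values["AR"], values["EGF"])
        print(f"{system}: AR required about {ratio:.1f}x the EGF concentration")
```

On these numbers AR needed somewhat under twice the EGF concentration on fixed cells and roughly three times on membranes, consistent with the lower apparent affinity described in the text.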

EGF or TGF-α induces anchorage-independent 
growth of rat kidney cells NRK-SA6 
in the presence of TGF-β (22). EGF 
induced anchorage-independent growth of 
NRK cells in a dose-dependent manner in 
the presence of TGF-β, whereas AR was 
found to be a noninducer of colony formation 
in soft agar of NRK cells (Fig. 3B). The 
continued growth of a murine keratinocyte 
cell line, Balb/MK, is dependent on EGF or 
TGF-α (23). Balb/MK cells did not proliferate 
in the absence of AR or EGF. However, 
these cells proliferated equally well in the 
presence of AR or of EGF (Fig. 3C). Thus, 
Fig. 2. Alignment of AR sequence with other EGF-like proteins. Amino 
acids are represented by standard one-letter symbols (24). Only residues 
appearing in eight or more proteins are boxed. Hyphens indicate gaps 
introduced to maximize homology. Dots at the beginning and the end of 
sequences indicate the use of only partial sequence of a given protein. 
Numbers at the beginning of every sequence indicate the number of amino 
acid residues within the total protein sequence; i, beginning of the 
sequence of the truncated form of AR; and *, potential glycosylation site. 
Fig. 3. (A) Competition of 125I-EGF binding to the fixed A431 cells
(solid symbols) or A431 plasma membranes (open symbols) by murine EGF
and AR. EGF was radioiodinated with 125I as described (25). The binding
assays were performed either in 48-well tissue culture plates when
Formalin-fixed A431 cells were used as described (26) or by immobilizing
plasma membranes onto 96-well poly(vinyl chloride) plates as described
(27). The binding assays used 4 ng of 125I-labeled mouse EGF per
milliliter, containing ~1.9 x 10^5 dpm. Samples of 100 and 50 µl were
used per well for assays with fixed cells and membranes, respectively.
Circles indicate EGF, and triangles indicate AR. (B) Effect of EGF and
AR on NRK-SA6 cell colony formation in soft agar in the presence of
TGF-β (1 ng/ml). A 0.38-ml base layer of 0.5% agar (Agar Noble, Difco
Laboratories, Detroit) in Dulbecco's minimum essential medium containing
10% heat-inactivated fetal bovine serum (FBS) was added to 24-well
Costar tissue culture plates. A 0.3% agar (0.38 ml) containing the same
medium-FBS mixture, 6 x 10^3 to 12 x 10^3 test cells, and the factors to
be tested was overlaid on the basal layer of agar. The plates were
incubated at 37°C in a humidified atmosphere of 5% CO2 in air. Colonies
were enumerated unfixed and unstained, and the number of colonies was
scored between days 7 and 10. Colonies were defined as a cluster of at
least eight cells. Circles, EGF; and triangles, AR. (C) Effect of AR and
EGF on the growth of murine keratinocytes. Balb/MK cells were plated at
1 x 10^4 cells per well in 1 ml of low calcium medium (23) in 24-well
Costar plates (area ~2 cm^2 per well) and incubated overnight at 37°C.
Then media were removed and replaced with 1 ml of medium containing
various concentrations of AR or EGF in triplicate. The control wells
received only medium without any AR or EGF. Plates were incubated at
37°C for 4 days, then medium was removed, wells were rinsed two times
with 1 ml of phosphate-buffered saline, and the cells were detached with
trypsin-EDTA and counted. Circles, EGF; and triangles, AR.
AR can supplant the EGF requirement in these cells. These results
indicate that, like EGF and TGF-α, AR acts as a growth stimulator, but
is much weaker on some cells (normal rat kidney) and comparable on
others (murine keratinocytes).

Available structural data should allow studies on the cloning,
structure, topology, expression, and regulation of the amphiregulin gene
under both physiological and pathological conditions. These studies may
also provide clues to design agonists and antagonists of this
bifunctional growth regulator.